Posts

Showing posts from March, 2026

Perfecting my recipe for Java and Docker

 I have a lot of Docker builds that I do for Hadoop/Livy/Hive. I have a few broad challenges because I bounce between being a developer and a user. At times I want something off the shelf with minor customization. Other times I want to patch and rebuild. The applications can be fairly sizable in terms of dependencies. One of the biggest challenges I have run into is complications involving Docker layers. The idea behind Docker layers is that if you take a given Dockerfile:

RUN mkdir /abc
RUN mkdir /def

and place the frequently changed things closer to the bottom, then the upper layers are reusable. This is true, but there are big challenges. In a large Java project that you would roll into an assembly JAR, a one-line change to a single file in 1000 source files invalidates the entire JAR. Building requires many packages that do not make it into the final product (the ~/.m2 repo), and they periodically invalidate; my .m2 is 802 MB. Build processes frequently retouch files, changing timestamps and scratch dir...
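A common way to tame this kind of layer invalidation is to copy the build manifest before the source tree, so the dependency layer caches independently of code changes. This is a generic sketch, not the exact recipe from the post; the image tags and paths are illustrative:

```dockerfile
# Hypothetical multi-stage Maven build; tags and paths are illustrative.
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app

# Copy only the POM first: the dependency layer is invalidated only
# when pom.xml changes, not on every source edit.
COPY pom.xml .
RUN mvn -B dependency:go-offline

# Source changes invalidate layers only from here down.
COPY src ./src
RUN mvn -B package -DskipTests

# Runtime stage: the multi-hundred-MB ~/.m2 never reaches the final image.
FROM eclipse-temurin:17-jre
COPY --from=build /app/target/app.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```

This does not solve the assembly-JAR problem (any source change still rebuilds the JAR layer), but it keeps the dependency download out of the hot path.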

Guided Choice: Get only the answer you want with no fluff

 In my last blog we refactored the text sampling code in deliverance. I did that to prepare the code to take on something clever. You know how LLMs sometimes give you the answer inside a sea of text and never stop talking (like me). A solution to that is "structured outputs", which helps you control what the inference engine will produce. There are several forms of structured outputs. A simple one is guided choice. Effectively, you give the inference engine a prompt and a list of choices, and it should only answer with one of them. Something like:

prompt = "What is the best month for vacation"
choices = ["January", "February"]

So let's code it up! As always, here is the code: Guided Choice commit. One thing to think about ahead of time. Call it a "trick": LLMs don't answer in words, they answer in tokens. If you ask an LLM a question like "Who is the best NFL team?", it might have "Giants" in the vocabulary, or it migh...
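A minimal sketch of the guided-choice idea in Java. The class and method names here are hypothetical (not the deliverance API), and a toy scorer stands in for the real inference engine, which would score each candidate by its token log-probabilities:

```java
import java.util.List;

// Sketch of "guided choice": instead of free-form generation, score only
// the allowed completions and return the best-scoring one. The answer is
// therefore guaranteed to be a member of the choice list.
public class GuidedChoice {

    // Stand-in for the inference engine. A real engine would tokenize
    // prompt + candidate and sum the candidate's token log-probs.
    // This toy heuristic exists only so the sketch runs.
    static double score(String prompt, String candidate) {
        return -Math.abs(prompt.length() - candidate.length());
    }

    static String choose(String prompt, List<String> choices) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String c : choices) {
            double s = score(prompt, c);
            if (s > bestScore) {
                bestScore = s;
                best = c;
            }
        }
        return best;  // always an element of choices, never free text
    }

    public static void main(String[] args) {
        String answer = choose("What is the best month for vacation",
                List.of("January", "February"));
        System.out.println(answer);
    }
}
```

The key property is that the model never emits an unconstrained token stream: the "sea of text" problem disappears because only the enumerated candidates are ever scored.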

Large Language Models: Temperature and relative performance

 When you read about performance and tuning for large language models, the term temperature comes up often. I find a lot of interesting descriptions of it, from "it controls creativity" to "how hard it thinks". What I like to say is, "temperature at 0.0 is prone to over-fit". I avoid "high temperature is more likely to hallucinate," as I feel the 0.0 over-fit looks a lot like a hallucination as well. I rebuilt some of the deliverance code for sampling to give it an Object Oriented Design style face-lift and to prepare for adding repetition penalty. As always, feel free to look at the code here: https://github.com/edwardcapriolo/deliverance/commit/0162c1daa07cba6b43d4e4f75cb95f50b0ff11b2 One thing I was never overjoyed with was the size of the AbstractModel class. Many of the features are tightly coupled, and I wanted to get the sampling into its own class. It is not 100% clean, as we still walk back into the AbstractModel to get config values and ot...
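For reference, here is what temperature actually does mechanically. This is a generic sketch (the class name is hypothetical, not the deliverance sampler): logits are divided by the temperature before the softmax, so temperature 0.0 collapses to greedy argmax (the "over-fit" mode) while higher values flatten the distribution:

```java
import java.util.Random;

// Sketch of temperature sampling. Dividing logits by the temperature
// before softmax sharpens (T < 1) or flattens (T > 1) the distribution;
// T == 0 degenerates to greedy argmax.
public class TemperatureSampler {

    static double[] softmaxWithTemperature(double[] logits, double temperature) {
        double[] probs = new double[logits.length];
        double max = Double.NEGATIVE_INFINITY;
        for (double l : logits) max = Math.max(max, l);
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            // subtract max for numerical stability before scaling by T
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;
        return probs;
    }

    static int sample(double[] logits, double temperature, Random rng) {
        if (temperature == 0.0) {
            // greedy: always pick the single highest-logit token
            int best = 0;
            for (int i = 1; i < logits.length; i++)
                if (logits[i] > logits[best]) best = i;
            return best;
        }
        // stochastic: draw from the temperature-scaled distribution
        double[] probs = softmaxWithTemperature(logits, temperature);
        double r = rng.nextDouble(), cum = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cum += probs[i];
            if (r < cum) return i;
        }
        return probs.length - 1;  // guard against floating-point rounding
    }

    public static void main(String[] args) {
        double[] logits = {1.0, 3.0, 2.0};
        System.out.println(sample(logits, 0.0, new Random()));  // greedy
        System.out.println(sample(logits, 1.0, new Random()));  // stochastic
    }
}
```

Seen this way, temperature 0.0 always repeats the model's single most confident continuation, which is why a confidently wrong answer at T=0 can look just like a hallucination.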