Posts

Exclude Top Choice takes VS repetition penalty

 A few days ago I hacked top_logprobs into deliverance. Logprobs show you which other tokens were close to being chosen, and the logprob (logarithm of the probability) of each one. I had never done something like this before, and the math turned out to be not so bad. Most of what I learned came from this excellent article and this one.

```java
public static void logSumExpTensor(AbstractTensor result, AbstractTensor input) {
    float logsumexp = (float) logSumExp(input);
    for (int i = 0; i < input.size(); i++) {
        float v = input.get(0, i);
        result.set(v - logsumexp, 0, i);
    }
}

public static double logSumExp(AbstractTensor x) {
    double sum = 0.0;
    for (int i = 0; i < x.size(); i++) {
        sum += FastMath.exp(x.get(0, i));
    }
    return FastMath.log(sum);
}
```

All this math had me a little more confident editing the Sampler code, so I decided to keep going. One thing ...
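To see what the logSumExp trick buys you, here is a tiny numeric illustration using plain arrays instead of deliverance's AbstractTensor (the class and values below are my own, made up for the demo): log-softmax is just each logit minus the logSumExp of all logits, and exponentiating a logprob recovers the probability.

```java
import java.util.Arrays;

// Illustration only: plain-array log-softmax, the same math the
// tensor version above performs inside deliverance.
public class LogprobsDemo {

    public static double logSumExp(double[] logits) {
        double sum = 0.0;
        for (double l : logits) sum += Math.exp(l);
        return Math.log(sum);
    }

    public static double[] logSoftmax(double[] logits) {
        double lse = logSumExp(logits);
        double[] out = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            out[i] = logits[i] - lse; // logprob of token i
        }
        return out;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.5, -1.0}; // fake model output for 3 tokens
        double[] logprobs = logSoftmax(logits);
        System.out.println(Arrays.toString(logprobs));
        // exp() of each logprob gives probabilities that sum to 1,
        // which is how "how close was the runner-up token" is read.
    }
}
```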

Perfecting my recipe for Java and Docker

 I have a lot of docker builds that I do: hadoop/livy/hive. I face a few broad challenges because I bounce between being a developer and a user. At times I want something off the shelf with minor customization. Other times I want to patch and rebuild. The applications can be fairly sizable in terms of dependencies. One of the biggest challenges I have run into is complications involving docker layers. The idea behind docker layers is that if you take a given docker file:

```
RUN mkdir /abc
RUN mkdir /def
```

and place the frequently changed things closer to the bottom, then the upper layers are reusable. This is true, but there are big challenges. In a large java project that you would roll into an assembly jar, a one line change to a single file among 1000 source files invalidates the entire JAR. Building requires many packages that do not make it into the final product, like the ~/.m2 repo, and they periodically invalidate. My .m2 is 802 MB. Build processes frequently retouch files changing timestamps and scratch dir...
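To make the layering point concrete, here is a hypothetical Dockerfile sketch (the paths and goals are made up for illustration, not taken from my actual builds) that orders slow-changing steps before fast-changing ones, so a source edit does not invalidate the dependency layers:

```dockerfile
# Slow-changing layers first: base image and dependency resolution.
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app

# Copying only the pom first means the ~/.m2 dependency layer stays
# cached across source-code edits; it only rebuilds when pom.xml changes.
COPY pom.xml .
RUN mvn -q dependency:go-offline

# Fast-changing layer last: a one line change in src invalidates
# only the layers from here down.
COPY src ./src
RUN mvn -q package
```

This ordering does not solve the assembly-jar problem described above (the packaged JAR still rebuilds on any source change), but it keeps the much larger dependency download out of the invalidated layers.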

Guided Choice: Get only the answer you want with no fluff

 In my last blog we refactored the text sampling code in deliverance. I did that to prepare the code to take on something clever. You know how LLMs sometimes give you the answer inside a sea of text and never stop talking (like me)? A solution to that is "structured outputs", which helps you control what the inference engine will produce. There are several forms of structured outputs. A simple one is guided choice. Effectively you give the inference engine a prompt and a list of choices, and it should only answer with one of them. Something like:

prompt = "What is the best month for vacation",
choices = ["January", "February"]

So let's code it up! As always, here is the code: Guided Choice commit. One thing to think about ahead of time. Call it a "trick". LLMs don't answer in words, they answer in tokens. If you ask an LLM a question like "Who is the best NFL team", it might have "Giants" in the vocabulary, or it migh...
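The core idea can be sketched in a few lines. This is my own simplified illustration, not the deliverance commit: it assumes each choice maps to a single known token id and simply restricts the argmax to those ids, while a real engine has to handle choices that span multiple tokens.

```java
import java.util.Map;

// Sketch of guided choice: instead of sampling over the whole
// vocabulary, only the tokens belonging to the allowed choices
// are ever considered. Names and values here are hypothetical.
public class GuidedChoice {

    public static String pick(float[] logits, Map<String, Integer> choiceToTokenId) {
        String best = null;
        float bestLogit = Float.NEGATIVE_INFINITY;
        for (Map.Entry<String, Integer> e : choiceToTokenId.entrySet()) {
            float logit = logits[e.getValue()]; // score of this choice's token
            if (logit > bestLogit) {
                bestLogit = logit;
                best = e.getKey();
            }
        }
        return best; // always one of the allowed choices, never free text
    }

    public static void main(String[] args) {
        float[] logits = {0.1f, 2.5f, -1.0f, 0.7f}; // fake model output
        Map<String, Integer> choices = Map.of("January", 1, "February", 3);
        System.out.println(pick(logits, choices)); // highest-scoring allowed choice
    }
}
```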

Large Language Models Temperature and relative performance

 When you read about performance and tuning for large language models, the term Temperature comes up often. I find a lot of interesting descriptions of it, from "It controls creativity" to "How hard it thinks". What I like to say is, "temperature at 0.0 is prone to over-fit". I avoid "high temperature is more likely to hallucinate," as I feel the 0.0 'over-fit' looks a lot like a hallucination as well. I rebuilt some of the deliverance code for sampling to give it an Object Oriented Design style face-lift, and to prepare for adding repetition penalty. As always, feel free to look at the code here: https://github.com/edwardcapriolo/deliverance/commit/0162c1daa07cba6b43d4e4f75cb95f50b0ff11b2 One thing I was never overjoyed with was the size of the AbstractModel class. Many of the features are tightly coupled; I wanted to get the sampling into its own class. It is not 100% clean, as we still walk back into the AbstractModel to get config values and ot...
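For readers who want to see what temperature actually does mechanically, here is my own small illustration (not the deliverance sampler code): the logits are divided by the temperature before softmax, so low temperature sharpens the distribution toward the top token and high temperature flattens it.

```java
import java.util.Arrays;

// Illustration of temperature scaling: softmax(logits / T).
// T < 1 sharpens (closer to greedy / "over-fit"), T > 1 flattens.
public class TemperatureDemo {

    public static double[] softmaxWithTemperature(double[] logits, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : logits) max = Math.max(max, l); // subtract max for stability
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;
        return probs;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5}; // fake scores for 3 tokens
        System.out.println(Arrays.toString(softmaxWithTemperature(logits, 0.5))); // sharp
        System.out.println(Arrays.toString(softmaxWithTemperature(logits, 2.0))); // flat
    }
}
```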

Securing Hadoop HA Yarn: Part 1 TLS & PKI

 First let's go over the basics of the Hadoop stack. The storage or filesystem, HDFS, is separated from the compute, Yarn. The hadoop-yarn handles container orchestration. It has two key components: the ResourceManager (RM), which is the leader, and the NodeManager, which is the worker. The design is a common, tried and true approach of having the leader handle a small number of discrete tasks and scaling out the workers to hundreds or thousands of nodes.

RM <- - - [ NM1, NM2, NM3, NM4, NM... ]

The design is not peer-to-peer. The NodeManager nodes start up and announce themselves to the RM. When an Application or Job is submitted to the group, the RM enforces quotas and ultimately provides the mechanism for the Application to request containers and use them for computation. I have been working hard on a series of compositions for running Hadoop on containers (docker/k8s) and on bare metal (ansible). You can find all the material (githu...

Testing resilience4j retry with mockable fault injection

 One thing that separates a well engineered project from a so-so one is the attention paid to error handling and retry. This isn't always as easy as it sounds: first you need to taxonomize Exceptions, then come up with appropriate retry and backoff strategies. To that end, I created some interfaces and classes to make a good showing of how to do this. The link to the code is here, but in the blog we will walk it step by step. Resilience4j (https://resilience4j.readme.io/docs/getting-started) helps with a great deal of this by being a small, purpose-built library with nice building blocks like retry, bulkhead, and circuit breaker. If you read about Hystrix from Netflix years back, this library grew from that one. You can not go in "half-baked": some things are useless to retry, such as a NullPointerException based on bad input. One of my favorite ways to design APIs is to introduce a clear exception hierarchy: things you can retry and things you can not. Below I used Interf...
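The taxonomy idea can be sketched in plain Java. This is my own hand-rolled illustration of the "retryable vs non-retryable" split, not the resilience4j code from the post; resilience4j's Retry building block does this properly with configurable backoff.

```java
import java.util.function.Supplier;

// Sketch of an exception taxonomy for retry. The class names here
// are hypothetical; the point is the split between failures worth
// retrying and failures that can never succeed on retry.
public class RetryDemo {

    // Transient failures: safe to try again (timeouts, flaky connections).
    public static class RetryableException extends RuntimeException {
        public RetryableException(String message) { super(message); }
    }

    // Programming or input errors: retrying can never help.
    public static class NonRetryableException extends RuntimeException {
        public NonRetryableException(String message) { super(message); }
    }

    public static <T> T withRetry(Supplier<T> op, int maxAttempts) {
        RetryableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RetryableException e) {
                last = e; // transient: loop and try again
            } // NonRetryableException is NOT caught, so it fails fast
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = withRetry(() -> {
            attempts[0]++;
            if (attempts[0] < 3) throw new RetryableException("transient glitch");
            return "ok";
        }, 5);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```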

Java Try monad using Java 17 sealed interfaces

One pattern I find very effective is using sealed interfaces or classes. They work really well in place of Enum because the trait can define an implementation, and the compiler can ensure that if a switch (java) or match (scala) statement references the type, every possible type is handled. Let's look at an implementation of the Try monad that just landed in maven central. For those not familiar with scala's Try:

```scala
val dividend = Try(StdIn.readLine("Enter an Int that you'd like to divide:\n").toInt)
val divisor = Try(StdIn.readLine("Enter an Int that you'd like to divide by:\n").toInt)
val problem = dividend.flatMap(x => divisor.map(y => x / y))

problem match {
  case Success(v) =>
    println("Result of " + dividend.get + "/" + divisor.get + " is: " + v)
    Success(v)
  case Failure(e) =>
    println("You must've divided by zero or entered something that...
```
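For a flavor of what this looks like on the Java side, here is a minimal sketch of a Try built on a Java 17 sealed interface. It is my own illustration of the pattern, not the exact code from the maven central artifact the post links to.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

// Sealed interface: Success and Failure are the only permitted
// implementations, which is what lets the compiler check exhaustiveness.
public sealed interface Try<T> {

    record Success<T>(T value) implements Try<T> {}
    record Failure<T>(Exception error) implements Try<T> {}

    // Capture any thrown exception as a Failure instead of propagating it.
    static <T> Try<T> of(Callable<T> op) {
        try {
            return new Success<>(op.call());
        } catch (Exception e) {
            return new Failure<>(e);
        }
    }

    // A failed Try stays failed; a successful one applies f (which may fail).
    default <R> Try<R> map(Function<T, R> f) {
        if (this instanceof Success<T> s) {
            return Try.of(() -> f.apply(s.value()));
        }
        return new Failure<>(((Failure<T>) this).error());
    }

    default T getOrElse(T fallback) {
        return (this instanceof Success<T> s) ? s.value() : fallback;
    }
}
```

Usage mirrors the scala example: `Try.of(() -> a / b)` gives a `Success` with the quotient, or a `Failure` wrapping the `ArithmeticException` when b is zero, and no try/catch clutters the call site.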