<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://balajeerc.info/feed.xml" rel="self" type="application/atom+xml" /><link href="https://balajeerc.info/" rel="alternate" type="text/html" /><updated>2026-04-08T06:15:55+00:00</updated><id>https://balajeerc.info/feed.xml</id><title type="html">Balajee Ramachandran</title><subtitle>Programmer. Blogger. Nice Guy.</subtitle><author><name>Balajee RamaChandran</name></author><entry><title type="html">Figma Prototyping No Longer Makes Sense</title><link href="https://balajeerc.info/Figma-Prototyping-No-Longer-Makes-Sense/" rel="alternate" type="text/html" title="Figma Prototyping No Longer Makes Sense" /><published>2026-02-10T00:00:00+00:00</published><updated>2026-02-10T00:00:00+00:00</updated><id>https://balajeerc.info/Figma-Prototyping-No-Longer-Makes-Sense</id><content type="html" xml:base="https://balajeerc.info/Figma-Prototyping-No-Longer-Makes-Sense/"><![CDATA[<p>I head Product and Growth at <a href="https://sensibull.com/">Sensibull</a> and as part of my job, I routinely make mockups/prototypes in Figma to explore new features or modifications to existing features.</p>

<p>Note that here I am only talking about low-fidelity prototypes. I am not talking about the high-fidelity mockups that product designers make further down the feature development process. But even at these preliminary stages, product prototyping can get quite detailed. These prototypes are what I and other product managers in the team create so that we can chew on ideas, figure out corner cases and the impact of a new feature on other aspects of our application.</p>

<p>The way we used to do this is using Figma. We import a ‘Whiteboarding’ component library and drag and drop elements to make quick prototypes.</p>

<p>However, of late I was working on a rather big feature, and was constantly struggling with the Figma editor. To be clear, the Figma editor is good, but there is still the whole process of having to set up the components using a drag and drop process. And as much as we’d like to tell ourselves that during the prototyping phase, only the core idea matters, the undeniable fact is that a hideous looking prototype does not do you any favours when trying to sell a new concept to the team. So then you start putting in a bit of effort to make things look acceptable. And that ends up a time sink where you are playing Barbie with your prototype whereas you should be thinking about the idea you are trying to communicate to the team.</p>

<p>Then comes the whole aspect of stitching together a coherent clickable prototype in Figma. You have to ‘stitch’ together various frames carefully and add necessary back buttons and so forth. At the end of it, you end up with a maze of screens with interactions going from one to the next and back. This can get quite hairy quite fast.</p>

<p>After a few hours of frustration, I decided I’d just vibe code it using Kilo Code.</p>

<p>The change in my iteration speed was just shocking.</p>

<p>I drew pen-and-paper mocks of the new feature I was planning to build, took photos of them with my phone camera and gave them to the agent, which made a very pretty-looking initial version of the prototype. After that, making changes was just a matter of prompting appropriately.</p>

<p>This process unlocks a whole lot of super-powers which need to be seen/experienced to be believed. Previously, the initial prototype, strive as much as we might to capture all the nuances, would invariably end up missing some glaring corner cases. However, being able to play with a live prototype immediately surfaces these rough edges in flow logic/product abstractions. And this lets you incorporate some really nuanced considerations in your spec.</p>

<p>Communication of the idea with team members now becomes ridiculously simple. The buttons are clickable, the modals come up as necessary.</p>

<p>Furthermore, generating a draft of the written product spec is just one prompt away. The LLM knows your entire codebase and can generate all necessary details from it.</p>

<p>The absolute kicker is how you can use the LLM itself for brainstorming and checking things you may not have considered. I literally ask it things like “What are elements/aspects I may be missing in this screen?” and I typically get one or two suggestions that end up catching tiny details that I missed.</p>

<p>This also becomes a much better input for the tech team to evaluate the feasibility of implementing the sometimes fanciful features that the product team dreams up. There’s nothing like a live prototype that you can interact with and play around with to bring out all the hairy corner cases that we are about to run into. That’s what Figma prototypes are supposed to do, but the vibe coded “live” prototype is that on steroids.</p>

<p>Now for some caveats:</p>

<ol>
  <li>A lot of my experience above is coloured by my background. I spent the first 16 years of my career writing code. I have been working in Product and Growth only for the last 5 years. And even in those 5 years, I was always working on at least one hobby project. I code every weekend. A product manager without that background might get mixed results, though the fabulous comprehension that frontier models show seems to make this gap smaller all the time.</li>
  <li>This is throw-away code. No one’s going to use this in production.</li>
  <li>Finally, as mentioned in my <a href="/Use-Deterministic-Guardrails-for-your-LLM-Agents/">previous post</a>, I have assembled a rather extensive scaffolding that lets me keep my code reasonably sane (small functions, small files, no code duplication, enforced modularisation etc.). In my observation, that also helps increase the probability of my prompt being transformed into a feature change or bug fix as I expect the agent to.</li>
</ol>

<p>Overall, I suppose this is a great time to be building.</p>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[I head Product and Growth at Sensibull and as part of my job, I routinely make mockups/prototypes in Figma to explore new features or modifications to existing features.]]></summary></entry><entry><title type="html">Use Deterministic Guardrails For Your Llm Agents</title><link href="https://balajeerc.info/Use-Deterministic-Guardrails-for-your-LLM-Agents/" rel="alternate" type="text/html" title="Use Deterministic Guardrails For Your Llm Agents" /><published>2026-01-26T00:00:00+00:00</published><updated>2026-01-26T00:00:00+00:00</updated><id>https://balajeerc.info/Use-Deterministic-Guardrails-for-your-LLM-Agents</id><content type="html" xml:base="https://balajeerc.info/Use-Deterministic-Guardrails-for-your-LLM-Agents/"><![CDATA[<blockquote>
  <p>NOTE: This post, every bit of it, was entirely written by a human. Please feel free to blame the author rather than an LLM for any bits you suspect are poorly written / inane :)</p>

</blockquote>

<p>This post is not meant to pontificate in favour of agentic LLMs. Instead, I’ll be focusing on how to keep your vibe-coded/LLM-agent-coded project from turning into unreadable mush, which is an all too common occurrence. Because, as much as I am an LLM/coding agent optimist, this is something that just cannot be denied.</p>

<p>Here are a few observations I have made while using agentic AI for coding:</p>

<ol>
  <li>It copy pastes code a lot. (I mean, a lot!)</li>
  <li>It does not remove old unused code.</li>
  <li>It does not follow any kind of code modularization, leading to terrible spaghetti code.</li>
  <li>It sometimes writes really long and complex functions. I mean 500-line for loops with nested ifs and further nested loops within them.</li>
  <li>Files grow really really long. Left to themselves, LLMs generate projects with 90% of files of size less than 200 lines, and some 4-5 monstrous files containing over 2000 lines each.</li>
</ol>

<p>These issues are less pronounced in the ‘first pass’ where you just gave it a prompt and got a decent working v1. However, as you keep making changes and iterating, the above problems start compounding fast.</p>

<p>With some trial and error what I’ve found is that having a series of linters as guardrails dramatically improves the code quality. And while this is anecdotal, it appears that improving code quality as mentioned below also makes the success rate of bug fixing and new feature addition through prompting an agent higher.</p>

<p>Furthermore, I find that the code generated with the guardrails mentioned below is easier to read and debug when you need to get down and fix some hairy nuanced issue that the LLM agent is unable to fix by itself.</p>

<p>Hopefully, this post brings your attention to the fact that some advanced deterministic linters/static code analyzers already exist for your language ecosystem. Tools like <code class="language-plaintext highlighter-rouge">clippy</code> for Rust and <code class="language-plaintext highlighter-rouge">golangci-lint</code> for Go have most of these checks already built in or available as plugins. So it’s just a matter of setting up config files appropriately.</p>

<p>These days, I only code as a hobby. I mostly make prototypes in relation to my work in the Product team. So, please feel free to treat my claims with a pinch of salt since they are not coming from a developer who still pushes code to production.</p>

<p>The suggestions below are influenced by my experience creating TypeScript prototypes. However, I believe most of them translate to other language ecosystems as well.</p>

<ol>
  <li>
    <p><strong>Type-safe compiler</strong></p>

    <p>I think this is an absolute must. Agentic code generation most certainly needs type safety. It eliminates a whole class of bugs. I’m not going to belabor this point in this post. If you are doing agentic development, you likely do need static type checking.</p>
  </li>
  <li>
    <p><strong>Enforce Basic Linting</strong></p>

    <p>Most mature language ecosystems have at least one great linting tool. Enforce as many of the strict rules as possible. Whether it’s <code class="language-plaintext highlighter-rouge">eslint</code>, <code class="language-plaintext highlighter-rouge">golangci-lint</code>, <code class="language-plaintext highlighter-rouge">clippy</code>… there is a wealth of options here which you likely know about and probably use already.</p>
  </li>
  <li>
    <p><strong>Keep files small</strong></p>

    <p>LLM-generated projects, especially those that are vibe coded without careful review, end up having some really small files with a couple of types or functions defined in them, and 2-3 mega files each spanning several thousand lines.</p>

    <p>Enforce linter rules that mandate that files cannot go over a certain threshold in length, maybe between 500 and 750 lines at most. <code class="language-plaintext highlighter-rouge">eslint</code> for example lets you enforce this using the <code class="language-plaintext highlighter-rouge">max-lines</code> rule. While doing this, note that you might need to carve out exceptions for some files which might contain code, but are essentially configuration or data.</p>
  </li>
  <li>
    <p><strong>Enable cyclomatic complexity checks</strong></p>

    <p>This is the first of the non-obvious options that I stumbled on recently and have found useful. Basically, cyclomatic complexity, while not perfect, does a decent job of clamping down on code with a lot of branching. When applied to a function, it measures the number of linearly independent paths through it. The <a href="https://eslint.org/docs/latest/rules/complexity">eslint docs</a> do a good job of explaining it.</p>

    <p>This, along with <code class="language-plaintext highlighter-rouge">eslint</code>’s <a href="https://eslint.org/docs/latest/rules/max-depth">max depth</a> rule, makes the generated functions easy to read.</p>
  </li>
  <li>
    <p><strong>Unused code removal</strong></p>

    <p>When you do a series of refactors or deep-rooted feature changes in a vibe-coded project, agentic LLMs leave a lot of legacy unused code in the code base. A related issue is that agentic LLMs will not, unless expressly prompted, remove third-party dependencies that are no longer used from the project.</p>

    <p>For the TypeScript/JS ecosystem, tools like <a href="https://knip.dev/">knip</a> help eliminate this by flagging unused exports, files and dependencies.</p>
  </li>
  <li>
    <p><strong>Prevent Code Duplication</strong></p>

    <p>Code refactors by agentic LLMs always result in a lot of copied code. I found that <a href="https://github.com/kucherenko/jscpd">jscpd</a> is really good at flagging these duplications. Unlike many of the other linters mentioned in this post, <code class="language-plaintext highlighter-rouge">jscpd</code> is language agnostic. So it can probably help find duplicates in your code base as well.</p>
  </li>
  <li>
    <p><strong>Enforcing modularization / folder dependency rules</strong></p>

    <p>This was, for me, the biggest revelation regarding the state of static code checkers available. I suspect this is less of a problem in some ecosystems like Rust where you need to break up large code bases into multiple cargo packages. But if you, like me, are working in a big monolith where the only code modularization primitive is a folder in the file system, then a tool like <a href="https://github.com/sverweij/dependency-cruiser">dependency-cruiser</a> is a godsend.</p>

    <p>Dependency cruiser lets you define fine grained directory-level modularisation rules as shown below:</p>

    <div class="language-jsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nx">module</span><span class="p">.</span><span class="nx">exports</span> <span class="o">=</span> <span class="p">{</span>
   <span class="na">forbidden</span><span class="p">:</span> <span class="p">[</span>
     <span class="c1">// ============================================</span>
     <span class="c1">// SECTION 1: Feature-Based Architecture Rules</span>
     <span class="c1">// ============================================</span>
     <span class="p">{</span>
       <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">no-feature-to-feature-imports</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">severity</span><span class="p">:</span> <span class="dl">'</span><span class="s1">error</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">comment</span><span class="p">:</span>
         <span class="dl">'</span><span class="s1">Features should not import from other features directly. Use shared/ for cross-feature code. </span><span class="dl">'</span> <span class="o">+</span>
         <span class="dl">'</span><span class="s1">This maintains feature independence and prevents tight coupling.</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">from</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|home|community|coverPage|performanceResults)/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
       <span class="na">to</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|home|community|coverPage|performanceResults)/</span><span class="dl">'</span><span class="p">,</span>
         <span class="na">pathNot</span><span class="p">:</span> <span class="p">[</span>
           <span class="c1">// Allow imports within same feature</span>
           <span class="dl">'</span><span class="s1">^src/$1/</span><span class="dl">'</span><span class="p">,</span>
         <span class="p">],</span>
       <span class="p">},</span>
     <span class="p">},</span>
     <span class="p">{</span>
       <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">no-shared-import-features</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">severity</span><span class="p">:</span> <span class="dl">'</span><span class="s1">error</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">comment</span><span class="p">:</span>
         <span class="dl">'</span><span class="s1">Shared code should not import from feature folders. Shared code should be truly generic </span><span class="dl">'</span> <span class="o">+</span>
         <span class="dl">'</span><span class="s1">and not depend on any specific feature implementation.</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">from</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/shared/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
       <span class="na">to</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|home|community|coverPage|performanceResults)/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
     <span class="p">},</span>
    
     <span class="c1">// ============================================</span>
     <span class="c1">// SECTION 2: Layer Architecture Rules (Within Features)</span>
     <span class="c1">// ============================================</span>
     <span class="p">{</span>
       <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">no-store-import-components</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">severity</span><span class="p">:</span> <span class="dl">'</span><span class="s1">error</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">comment</span><span class="p">:</span>
         <span class="dl">'</span><span class="s1">Store modules should not import from components. The store is the data layer and </span><span class="dl">'</span> <span class="o">+</span>
         <span class="dl">'</span><span class="s1">should be independent of UI concerns.</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">from</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|shared)/store/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
       <span class="na">to</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|home|community|coverPage|performanceResults|shared)/components/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
     <span class="p">},</span>
     <span class="p">{</span>
       <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">no-store-import-hooks</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">severity</span><span class="p">:</span> <span class="dl">'</span><span class="s1">error</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">comment</span><span class="p">:</span>
         <span class="dl">'</span><span class="s1">Store modules should not import from hooks. The store provides data that hooks can use, </span><span class="dl">'</span> <span class="o">+</span>
         <span class="dl">'</span><span class="s1">but store should not depend on hooks.</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">from</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|shared)/store/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
       <span class="na">to</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|home|community|coverPage|performanceResults|shared)/hooks/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
     <span class="p">},</span>
     <span class="p">{</span>
       <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">no-types-import-runtime</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">severity</span><span class="p">:</span> <span class="dl">'</span><span class="s1">error</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">comment</span><span class="p">:</span>
         <span class="dl">'</span><span class="s1">Type definition files should only contain types. They should not import runtime code </span><span class="dl">'</span> <span class="o">+</span>
         <span class="dl">'</span><span class="s1">from components, hooks, store, or data.</span><span class="dl">'</span><span class="p">,</span>
       <span class="na">from</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|home|community|coverPage|performanceResults|shared)/types/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
       <span class="na">to</span><span class="p">:</span> <span class="p">{</span>
         <span class="na">path</span><span class="p">:</span> <span class="dl">'</span><span class="s1">^src/(builder|home|community|coverPage|performanceResults|shared)/(components|hooks|store|data)/</span><span class="dl">'</span><span class="p">,</span>
       <span class="p">},</span>
     <span class="p">},</span>
  <span class="p">],</span>
 <span class="p">};</span>
    
</code></pre></div>    </div>

    <p>As an example, note how the first rule in the configuration forbids features from importing each other directly. When the violation is flagged, the rule’s comment also says to use the <code class="language-plaintext highlighter-rouge">shared/</code> directory for such cases. The LLM agent now has the hint it needs to do the right thing, which is to put code that is used across multiple features into the shared directory.</p>

    <p>One note of caution here. One pitfall that the above <code class="language-plaintext highlighter-rouge">shared</code> directory rule can get you into is the <code class="language-plaintext highlighter-rouge">shared</code> directory itself growing too large and collecting all kinds of unrelated cruft.</p>

    <p>In my case, I was able to use dependency-cruiser as a library in a custom script to ensure that a shared file is used in at least two features (hence enforcing its ‘shared’ status). Just something to watch out for.</p>
  </li>
  <li>
    <p><strong>Security Checks</strong></p>

    <p>There are quite a few static checkers that analyze code for security vulnerabilities. I use <code class="language-plaintext highlighter-rouge">semgrep</code>, but there are several options to choose from in this space (some of them with paid tiers, like semgrep itself).</p>
  </li>
</ol>
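
<p>The ‘shared file must be used by at least two features’ check mentioned under item 7 can be sketched roughly as follows. This is only an illustration of the logic, written in Python for brevity. It does not use dependency-cruiser’s API, and the input shape (a list of importer/imported path pairs), the function name <code class="language-plaintext highlighter-rouge">find_under_shared_files</code> and the example paths are all assumptions made for this sketch.</p>

```python
import re
from collections import defaultdict

# Feature folder names taken from the dependency-cruiser config above.
FEATURE_PATTERN = re.compile(
    r"^src/(builder|home|community|coverPage|performanceResults)/"
)
SHARED_PREFIX = "src/shared/"


def find_under_shared_files(edges, min_features=2):
    """Return shared files imported by fewer than `min_features` distinct features.

    `edges` is an iterable of (importer_path, imported_path) pairs. This input
    shape is an assumption for the sketch, not dependency-cruiser's own format.
    """
    features_by_shared_file = defaultdict(set)
    for importer, imported in edges:
        if not imported.startswith(SHARED_PREFIX):
            continue
        # Touch the entry so a shared file imported only by other shared
        # files is still tracked (with zero feature users).
        users = features_by_shared_file[imported]
        match = FEATURE_PATTERN.match(importer)
        if match:
            users.add(match.group(1))
    return sorted(
        path
        for path, users in features_by_shared_file.items()
        if len(users) < min_features
    )


# Hypothetical import edges, for illustration only.
example_edges = [
    ("src/builder/components/Editor.tsx", "src/shared/components/Button.tsx"),
    ("src/home/components/Hero.tsx", "src/shared/components/Button.tsx"),
    ("src/builder/store/index.ts", "src/shared/utils/format.ts"),
    ("src/builder/hooks/useDraft.ts", "src/shared/utils/format.ts"),
]
violations = find_under_shared_files(example_edges)
print(violations)
```

<p>Here only <code class="language-plaintext highlighter-rouge">src/shared/utils/format.ts</code> is reported, since both of its importers live in the same feature. Note that this sketch only counts direct imports and never sees shared files that nothing imports at all. An unused-code checker would be the natural place to catch those.</p>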

<h2 id="putting-it-all-together">Putting it all together</h2>

<p>Once you have assembled the list of linter configuration rules you’d like, it’s time to enforce them. All the agentic tools have some kind of <code class="language-plaintext highlighter-rouge">rules</code> directory where you can mandate that the agent run a check once it finishes a task. In my case, I chain all the checks together in my <code class="language-plaintext highlighter-rouge">package.json</code> like so:</p>

<div class="language-jsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
  <span class="dl">"</span><span class="s2">name</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">my-prototype</span><span class="dl">"</span><span class="p">,</span>
  <span class="dl">"</span><span class="s2">private</span><span class="dl">"</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
  <span class="dl">"</span><span class="s2">version</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">0.0.0</span><span class="dl">"</span><span class="p">,</span>
  <span class="dl">"</span><span class="s2">type</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">module</span><span class="dl">"</span><span class="p">,</span>
  <span class="dl">"</span><span class="s2">scripts</span><span class="dl">"</span><span class="p">:</span> <span class="p">{</span>
    <span class="dl">"</span><span class="s2">dev</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">(fuser -k 5173/tcp || true) &amp;&amp; vite</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">build</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">vite build</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">lint</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">eslint . --max-warnings=0</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">check:code-duplication</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">jscpd src --min-lines 10 --threshold 0</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">check:unused-code</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">knip</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">check:dependency-rules</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">depcruise src --config .dependency-cruiser.cjs</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">check:shared-usage</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">node scripts/check-shared-usage-depcruise.mjs</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">check:security</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">semgrep --config 'p/react' --config 'p/typescript' --include='*.tsx' --error</span><span class="dl">"</span><span class="p">,</span>
    <span class="dl">"</span><span class="s2">check</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">tsc --noEmit &amp;&amp; pnpm lint &amp;&amp; pnpm check:unused-code &amp;&amp; pnpm check:code-duplication &amp;&amp; pnpm check:dependency-rules &amp;&amp; pnpm check:shared-usage &amp;&amp; pnpm check:security</span><span class="dl">"</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>And in my agent rules, I mandate that the agent always runs <code class="language-plaintext highlighter-rouge">pnpm check</code> before task completion.</p>

<p>Furthermore, I use <code class="language-plaintext highlighter-rouge">pnpm check</code> as a pre-commit hook so that even if the agent misses running the checks, at commit time, they are run and any violations get flagged.</p>

<h2 id="conclusion">Conclusion</h2>

<p>As with most posts related to agentic coding, most of this is anecdotal. However, the primary takeaway would be that the linting and static analysis tools available today are pretty great. And that even adding a few of these as guardrails for your agentic LLM assistants will go a long way to improving the resulting code quality.</p>

<p>Hope you have as much luck (if not more) with this as I did with the agentic LLM slot machines :)</p>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[NOTE: This post, every bit of it, was entirely written by a human. Please feel free to blame the author rather than an LLM for any bits you suspect is poorly written / inane :)]]></summary></entry><entry><title type="html">The Joy Of Typed Python</title><link href="https://balajeerc.info/The-Joy-of-Typed-Python/" rel="alternate" type="text/html" title="The Joy Of Typed Python" /><published>2020-12-21T00:00:00+00:00</published><updated>2020-12-21T00:00:00+00:00</updated><id>https://balajeerc.info/The-Joy-of-Typed-Python</id><content type="html" xml:base="https://balajeerc.info/The-Joy-of-Typed-Python/"><![CDATA[<p>If I am to start working on a new project today, I would hesitate to attempt it in a language that does not have compile-time type checking. However, I do have to deal with Python at work (though we are slowly phasing it out). Also, I have been working off and on, in my spare time, on a Python project that has over the past 3+ years gotten fairly large as personal projects go. It started out as a one-off quick script. It eventually evolved into something larger that actually does something useful for me so I ended up adding to it and maintaining it.</p>

<p>A little over a year and a half ago, frustrated by my inability to refactor this code the way I can in type-safe languages, I started exploring the possibility of adding type hints to the codebase. Having since spent the requisite time understanding the implications of type hinting and whether it is useful, I decided to write this post, partly to consolidate my thoughts on the matter for friends and colleagues, on what an <strong>absolute joy</strong> it has become to refactor and work with Python once you have type checking enforced.</p>

<p>Just to set expectations right: the following is more gushing praise for <code class="language-plaintext highlighter-rouge">mypy</code> than <a href="https://mypy.readthedocs.io/en/stable/getting_started.html">a tutorial on how to use it</a>. If you are already using <code class="language-plaintext highlighter-rouge">mypy</code> on a regular basis, you are likely going to learn little from what follows. It is written in the hope of giving friends of mine who are Python programmers, and who aren’t already using type annotations, an overview of how to get started by detailing the nice things they provide. It also details some of the effort you will likely need to put in up front to make type checking effective in your codebase, and the rough edges (yes, there are some) in the type annotation system that will likely cause you some annoyance.</p>

<p>Another expectation I’d like to calibrate is for people coming to this from a language with a sophisticated type system such as Rust or Haskell, or even a more traditional one like Java or C++. You take static type checking for granted, and you will likely scoff at some of the things written here. However, I urge you to look at the benefits it brings compared to the current state of Python programming, which has no static type checking.</p>

<h2 id="overview-of-type-annotations-in-python">Overview of Type Annotations in Python</h2>

<p>First things first: Python has supported type hints since version 3.5, as detailed in <a href="https://www.python.org/dev/peps/pep-0484">PEP 484</a>. This means that you can write functions that look like this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">sum</span><span class="p">(</span><span class="n">a</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">int</span><span class="p">:</span>
    <span class="k">return</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span>
</code></pre></div></div>

<p>However, the Python interpreter in no way enforces these annotations, and will not enforce them in future. Python is expected to remain a dynamically typed language for the foreseeable future. The above function will run happily if called with <code class="language-plaintext highlighter-rouge">sum("hello", "world")</code> and return a concatenation of the two strings, contrary to the intended use of the function. The Python interpreter simply ignores the type annotations.</p>
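<p>A minimal sketch of this behaviour follows. The function is renamed to <code class="language-plaintext highlighter-rouge">add</code> here (my choice, purely to avoid shadowing the built-in <code class="language-plaintext highlighter-rouge">sum</code>); everything else mirrors the example above. The annotations are stored on the function object but play no part in the call.</p>

```python
def add(a: int, b: int) -> int:
    return a + b


# The interpreter records the annotations but never checks them,
# so this call succeeds at runtime even though a type checker would flag it.
result = add("hello", "world")

# The hints are still available for tools to inspect.
annotations = add.__annotations__

print(result)
print(annotations)
```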

<p>However, you can use third party type checkers which will do static analysis of your code based on these type annotations and point out any type errors lurking in your code. The introduction of type annotations as part of the language syntax itself provides for much better aesthetics as compared to having to retrofit the type hints via code comments (a la Flow for pre-ES6 JavaScript) and better ergonomics as compared to needing a transpiler to convert a new augmented language to valid vanilla code (like TypeScript does).</p>

<p>One such type-checker is <code class="language-plaintext highlighter-rouge">mypy</code> - the earliest implementation of its kind that some of the Python core team are involved in, including the venerable Guido van Rossum. However, note that <code class="language-plaintext highlighter-rouge">mypy</code> is distinct from the Python interpreter and <a href="http://www.mypy-lang.org/">maintained as a separate project</a>. You will need to download and install it separately.</p>

<p>Running <code class="language-plaintext highlighter-rouge">mypy</code> on a file containing the invocation of the <code class="language-plaintext highlighter-rouge">sum</code> function with string arguments shown above results in errors like the following from the type checker:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Argument 2 to "sum" has incompatible type "str"; expected "int"
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">mypy</code> is not the only type checker out there. There are other type checkers such as <code class="language-plaintext highlighter-rouge">Pyright</code> from Microsoft and <code class="language-plaintext highlighter-rouge">pyre</code> from Facebook. However, my experience with them is limited and the following notes are based on my experience with <code class="language-plaintext highlighter-rouge">mypy</code> exclusively.</p>

<h2 id="things-i-love">Things I love</h2>

<p>The trivial example that I stated above does not do justice to the sophistication that <code class="language-plaintext highlighter-rouge">mypy</code> brings. While you will need to grapple with installing it, configuring it and running it on your code, in this section I’ll be detailing some of my favourite things it facilitates once you pay that price.</p>

<h3 id="optional-values-strictly-enforced">Optional Values, Strictly Enforced</h3>

<p>If there is a single thing that you can take away from this post, it ought to be the unreasonable effectiveness of strictly checked optional values. This one feature alone has improved the correctness of my code so dramatically that I cannot emphasise it enough.</p>

<p>Errors in software arising from not checking for nulls are <a href="https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/">“a billion dollar problem”</a> and <code class="language-plaintext highlighter-rouge">mypy</code> is an invaluable guard against them. The primary type annotation (<em>type modifier</em> in <code class="language-plaintext highlighter-rouge">mypy</code> terms) that helps with this is the <a href="https://docs.python.org/3/library/typing.html#typing.Optional"><code class="language-plaintext highlighter-rouge">Optional</code></a> annotation. In my opinion, explicitly handling and indicating where a value can be null fosters greater code correctness as compared to approaches in languages like Java where any value can potentially be null.</p>

<p>Consider a canonical ‘customer record’ management scenario (yes, I’m unimaginative like that). Presumably, a simple customer type could be defined like so:</p>

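<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span>

<span class="k">class</span> <span class="nc">Customer</span><span class="p">:</span>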
    <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">address</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span>
</code></pre></div></div>

<p>In this type, we encode the fact that we may sometimes not have the customer’s address registered in our database into the type itself. In vanilla Python without annotations, we would have just defined it as a string, with the possibility that it may be <code class="language-plaintext highlighter-rouge">None</code> being something we need to ‘remember’ any time we use it.</p>
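<p>The class above only declares its fields. One way to make it constructible, assuming you are happy to use the standard library <code class="language-plaintext highlighter-rouge">dataclasses</code> module (Python 3.7+), is sketched below. The default of <code class="language-plaintext highlighter-rouge">None</code> for the address and the sample values are my own assumptions for illustration.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    name: str
    # The address may be missing, and the type says so explicitly.
    address: Optional[str] = None


# Hypothetical sample records, purely for illustration.
with_address = Customer(name="Asha", address="12 Main Street")
without_address = Customer(name="Ravi")

print(with_address)
print(without_address)
```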

<p>To demonstrate the utility of the <code class="language-plaintext highlighter-rouge">Optional</code> type used in conjunction with <code class="language-plaintext highlighter-rouge">mypy</code>, let’s look at a (slightly contrived) example of an implementation checking how far a customer resides from our store. To simplify the example, we abstract away the details of calling a <a href="https://developers.google.com/maps/documentation/javascript/examples/geocoding-simple">Geocoding service</a> and the code to calculate the distance between two geo-coordinates.</p>

<p>A first iteration of this code might look like so:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># A type alias for a (lat, long) tuple
</span><span class="n">Coords</span> <span class="o">=</span> <span class="n">Tuple</span><span class="p">[</span><span class="nb">float</span><span class="p">,</span> <span class="nb">float</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">geo_lookup</span><span class="p">(</span><span class="n">input_address</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Coords</span><span class="p">]:</span>
    <span class="c1"># ... snip ...
</span>    <span class="c1"># Geocoding API lookup
</span>    <span class="c1"># Hardcoding return for now
</span>    <span class="k">return</span> <span class="p">(</span><span class="mf">12.972442</span><span class="p">,</span> <span class="mf">77.580643</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">calculate_distance_in_miles</span><span class="p">(</span><span class="n">coord1</span><span class="p">:</span> <span class="n">Coords</span><span class="p">,</span> <span class="n">coord2</span><span class="p">:</span> <span class="n">Coords</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">float</span><span class="p">:</span>
    <span class="c1"># ... snip ...
</span>    <span class="c1"># Distance computation
</span>    <span class="c1"># Hardcoding return for now
</span>    <span class="k">return</span> <span class="mf">3.5</span>

<span class="c1"># WARNING: Buggy code below
</span><span class="k">def</span> <span class="nf">get_delivery_distance</span><span class="p">(</span><span class="n">customer</span><span class="p">:</span> <span class="n">Customer</span><span class="p">,</span> <span class="n">store_coords</span><span class="p">:</span> <span class="n">Coords</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">float</span><span class="p">:</span>
    <span class="n">geo_coords</span><span class="p">:</span> <span class="n">Coords</span> <span class="o">=</span> <span class="n">geo_lookup</span><span class="p">(</span><span class="n">customer</span><span class="p">.</span><span class="n">address</span><span class="p">)</span> <span class="c1"># Errors here
</span>    <span class="k">return</span> <span class="n">calculate_distance_in_miles</span><span class="p">(</span><span class="n">geo_coords</span><span class="p">,</span> <span class="n">store_coords</span><span class="p">)</span>
</code></pre></div></div>

<p>Running this through <code class="language-plaintext highlighter-rouge">mypy</code> generates the following errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>geo.py:18: error: Incompatible types in assignment (expression has type "Optional[Tuple[float, float]]", variable has type "Tuple[float, float]")
geo.py:18: error: Argument 1 to "geo_lookup" has incompatible type "Optional[str]"; expected "str"
</code></pre></div></div>

<p>As you can imagine, this is quite a lifesaver in more real world conditions. <code class="language-plaintext highlighter-rouge">mypy</code> just caught the error of using <code class="language-plaintext highlighter-rouge">customer.address</code> without checking if it is <code class="language-plaintext highlighter-rouge">None</code>. It also caught the bug where we are trying to assign the return value of <code class="language-plaintext highlighter-rouge">geo_lookup</code> to <code class="language-plaintext highlighter-rouge">geo_coords</code> without accounting for the fact that it too can be <code class="language-plaintext highlighter-rouge">None</code>.</p>

<p>The corrected version of <code class="language-plaintext highlighter-rouge">get_delivery_distance</code> would look like so:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">get_delivery_distance</span><span class="p">(</span><span class="n">customer</span><span class="p">:</span> <span class="n">Customer</span><span class="p">,</span> <span class="n">store_coords</span><span class="p">:</span> <span class="n">Coords</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">float</span><span class="p">]:</span>
    <span class="k">if</span> <span class="n">customer</span><span class="p">.</span><span class="n">address</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">None</span>
    <span class="n">geo_coords</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Coords</span><span class="p">]</span> <span class="o">=</span> <span class="n">geo_lookup</span><span class="p">(</span><span class="n">customer</span><span class="p">.</span><span class="n">address</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">geo_coords</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">calculate_distance_in_miles</span><span class="p">(</span><span class="n">geo_coords</span><span class="p">,</span> <span class="n">store_coords</span><span class="p">)</span>
    <span class="k">return</span> <span class="bp">None</span>
</code></pre></div></div>

<p>In the above code, we change the return type of <code class="language-plaintext highlighter-rouge">get_delivery_distance</code> to account for the possibility that it may be <code class="language-plaintext highlighter-rouge">None</code>. Furthermore, we expressly handle both the possibility of <code class="language-plaintext highlighter-rouge">customer.address</code> being <code class="language-plaintext highlighter-rouge">None</code> as well as the return from the geocoding lookup service being <code class="language-plaintext highlighter-rouge">None</code>. As you see, <code class="language-plaintext highlighter-rouge">mypy</code> cleverly infers that you have indeed handled the <code class="language-plaintext highlighter-rouge">None</code> cases appropriately and hence does not report errors in the new code.</p>

<p>As you can observe, this enforcement of checking for optional values cascades through your codebase and <code class="language-plaintext highlighter-rouge">mypy</code> is always looking over your shoulder to warn you about places where you have not handled them. Furthermore, it’s very cool that <code class="language-plaintext highlighter-rouge">mypy</code> manages to do a significant amount of type inference based on the conditional blocks, looking for checks for <code class="language-plaintext highlighter-rouge">None</code>. Contrast this with languages like C++, where code that calls the <code class="language-plaintext highlighter-rouge">std::optional::value()</code> method on the equivalent <code class="language-plaintext highlighter-rouge">std::optional</code> without checking it first will happily compile, and then fail at runtime by throwing a <code class="language-plaintext highlighter-rouge">bad_optional_access</code> exception if the optional in fact holds no value.</p>
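<p>To make the narrowing concrete, here is a small sketch that is separate from the delivery example. The function <code class="language-plaintext highlighter-rouge">display_address</code> and its placeholder text are hypothetical names of mine. After the early return on <code class="language-plaintext highlighter-rouge">None</code>, a checker such as <code class="language-plaintext highlighter-rouge">mypy</code> treats the value as a plain <code class="language-plaintext highlighter-rouge">str</code> for the rest of the function.</p>

```python
from typing import Optional


def display_address(address: Optional[str]) -> str:
    if address is None:
        # In this branch the checker knows address is None.
        return "(no address on file)"
    # Past the early return, address is narrowed from Optional[str] to str,
    # so calling str methods needs no further check.
    return address.strip().title()


print(display_address(None))
print(display_address("  12 main street "))
```

<p>Removing the <code class="language-plaintext highlighter-rouge">None</code> check should make the checker complain about the method call on the last line, which is exactly the kind of reminder described above.</p>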

<p>If you are a Rust or Haskell programmer, you are at this point cringing at all the conditional checks, and missing your <code class="language-plaintext highlighter-rouge">Either</code>s and <code class="language-plaintext highlighter-rouge">Maybe</code>s. There are some <a href="https://github.com/dry-python/returns#result-container">third party libraries</a> that (sorta) provide these for you, though I have not used them in my code.</p>

<h3 id="generics">Generics</h3>

<p>The irony of lauding the support for generics in the type annotations of a language that is intrinsically dynamically typed is not lost on me. Nevertheless, as soon as you enter the world that is typed Python, even if it’s make-believe, you start needing generics. Thankfully, the support for them is quite nice.</p>

<p>Here is a snippet plagiarised entirely from the <code class="language-plaintext highlighter-rouge">mypy</code> <a href="https://mypy.readthedocs.io/en/stable/generics.html#defining-generic-classes">documentation</a>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">TypeVar</span><span class="p">,</span> <span class="n">Generic</span><span class="p">,</span> <span class="n">List</span>

<span class="n">T</span> <span class="o">=</span> <span class="n">TypeVar</span><span class="p">(</span><span class="s">'T'</span><span class="p">)</span>

<span class="k">class</span> <span class="nc">Stack</span><span class="p">(</span><span class="n">Generic</span><span class="p">[</span><span class="n">T</span><span class="p">]):</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
        <span class="c1"># Create an empty list with items of type T
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">items</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">T</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>

    <span class="k">def</span> <span class="nf">push</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">item</span><span class="p">:</span> <span class="n">T</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">items</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">item</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">pop</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">T</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">items</span><span class="p">.</span><span class="n">pop</span><span class="p">()</span>

    <span class="k">def</span> <span class="nf">empty</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">bool</span><span class="p">:</span>
        <span class="k">return</span> <span class="ow">not</span> <span class="bp">self</span><span class="p">.</span><span class="n">items</span>
</code></pre></div></div>

<p>The generic <code class="language-plaintext highlighter-rouge">T</code> in the snippet can be substituted with any type. However, in case you want to restrict <code class="language-plaintext highlighter-rouge">T</code> to be one of a limited set of types, that is possible too:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">TypeVar</span>

<span class="n">Numeric</span> <span class="o">=</span> <span class="n">TypeVar</span><span class="p">(</span><span class="s">'Numeric'</span><span class="p">,</span> <span class="nb">int</span><span class="p">,</span> <span class="nb">float</span><span class="p">)</span>
</code></pre></div></div>
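<p>As a rough sketch of how such a constrained type variable might be used, consider the hypothetical <code class="language-plaintext highlighter-rouge">total</code> function below (the function and its parameters are my own illustration, not from the <code class="language-plaintext highlighter-rouge">mypy</code> documentation). Within a single call, <code class="language-plaintext highlighter-rouge">Numeric</code> stands for either <code class="language-plaintext highlighter-rouge">int</code> or <code class="language-plaintext highlighter-rouge">float</code>, so the return type follows the argument types.</p>

```python
from typing import List, TypeVar

Numeric = TypeVar("Numeric", int, float)


def total(values: List[Numeric], start: Numeric) -> Numeric:
    # The checker verifies this body once per allowed type (int, then float).
    result = start
    for value in values:
        result = result + value
    return result


int_total = total([1, 2, 3], 0)        # inferred as int
float_total = total([0.5, 1.5], 0.0)   # inferred as float

print(int_total, float_total)
```

<p>A call such as <code class="language-plaintext highlighter-rouge">total(["a", "b"], "")</code> would still run, but the type checker should reject it because <code class="language-plaintext highlighter-rouge">str</code> is not one of the permitted types.</p>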

<h3 id="interfaces">Interfaces</h3>

<p>Classic interface implementation, as is standard in object-oriented Python using the <code class="language-plaintext highlighter-rouge">abc</code> module, <a href="https://mypy.readthedocs.io/en/stable/class_basics.html#abstract-base-classes-and-multiple-inheritance">just works</a>. It generally does not excite me much, since most code I prefer to write myself is procedural.</p>

<p>However, thanks to the fact that Python supports static inheritance, i.e. you can mark a <code class="language-plaintext highlighter-rouge">@classmethod</code> or <code class="language-plaintext highlighter-rouge">@staticmethod</code> as an <code class="language-plaintext highlighter-rouge">@abstractmethod</code>, you can create contracts that classes can implement. This might sound a bit strange to people coming from Java, where static methods cannot be declared abstract.</p>

<p>Consider the following example:</p>

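<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Lets TreeNode appear in its own method annotations
</span><span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">annotations</span>

<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Callable</span>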
<span class="kn">from</span> <span class="nn">abc</span> <span class="kn">import</span> <span class="n">ABC</span><span class="p">,</span> <span class="n">abstractmethod</span>


<span class="k">class</span> <span class="nc">TreeNode</span><span class="p">(</span><span class="n">ABC</span><span class="p">):</span>
    <span class="s">"""Generic tree node"""</span>

    <span class="o">@</span><span class="n">abstractmethod</span>
    <span class="k">def</span> <span class="nf">parent</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">TreeNode</span><span class="p">:</span>
        <span class="s">"""Retrieve parent node"""</span>

    <span class="o">@</span><span class="n">abstractmethod</span>
    <span class="k">def</span> <span class="nf">children</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">List</span><span class="p">[</span><span class="n">TreeNode</span><span class="p">]:</span>
        <span class="s">"""Retrieve children"""</span>


<span class="k">class</span> <span class="nc">TreeWalk</span><span class="p">(</span><span class="n">ABC</span><span class="p">):</span>
    <span class="s">"""Collection of functions to walk/search a tree"""</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="o">@</span><span class="n">abstractmethod</span>
    <span class="k">def</span> <span class="nf">name</span><span class="p">(</span><span class="n">cls</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">str</span><span class="p">:</span>
        <span class="s">"""Name to use in registry"""</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="o">@</span><span class="n">abstractmethod</span>
    <span class="k">def</span> <span class="nf">walk</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">node</span><span class="p">:</span> <span class="n">TreeNode</span><span class="p">,</span> <span class="n">callback</span><span class="p">:</span> <span class="n">Callable</span><span class="p">[[</span><span class="n">TreeNode</span><span class="p">],</span> <span class="nb">bool</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
        <span class="s">"""
        Walk function that walks the tree node by node
        and invokes `callback` with each node as argument
        the first time it encounters it.
        If callback returns False, the walk is terminated.
        """</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="o">@</span><span class="n">abstractmethod</span>
    <span class="k">def</span> <span class="nf">last</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">root</span><span class="p">:</span> <span class="n">TreeNode</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">TreeNode</span><span class="p">:</span>
        <span class="s">"""
        Returns last node in the tree from the walk process
        """</span>


<span class="k">class</span> <span class="nc">DepthFirstWalk</span><span class="p">(</span><span class="n">TreeWalk</span><span class="p">):</span>
    <span class="s">"""Depth first search"""</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="k">def</span> <span class="nf">name</span><span class="p">(</span><span class="n">cls</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">str</span><span class="p">:</span>
        <span class="k">return</span> <span class="s">"DEPTH_FIRST"</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="k">def</span> <span class="nf">walk</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">node</span><span class="p">:</span> <span class="n">TreeNode</span><span class="p">,</span> <span class="n">callback</span><span class="p">:</span> <span class="n">Callable</span><span class="p">[[</span><span class="n">TreeNode</span><span class="p">],</span> <span class="nb">bool</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
        <span class="c1"># -- snip --
</span>        <span class="c1"># Implementation of walk
</span>        <span class="k">pass</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="k">def</span> <span class="nf">last</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">root</span><span class="p">:</span> <span class="n">TreeNode</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">TreeNode</span><span class="p">:</span>
        <span class="c1"># -- snip --
</span>        <span class="c1"># Implementation of last
</span>        <span class="k">raise</span> <span class="ne">NotImplementedError</span>


<span class="k">class</span> <span class="nc">BreadthFirstWalk</span><span class="p">(</span><span class="n">TreeWalk</span><span class="p">):</span>
    <span class="s">"""Breadth first walk"""</span>

    <span class="c1">## -- snip --
</span></code></pre></div></div>

<p>I prefer organising code as shown above rather than an equivalent implementation of <code class="language-plaintext highlighter-rouge">TreeWalk(er)</code> with <code class="language-plaintext highlighter-rouge">walk</code> and <code class="language-plaintext highlighter-rouge">last</code> as instance methods in line with traditional <code class="language-plaintext highlighter-rouge">Visitor</code> pattern implementations.</p>

<p>My justification for this is that the above code fosters the use of pure functions, and discourages any <code class="language-plaintext highlighter-rouge">TreeWalk</code> implementation from creating and retaining per-instance state. Effectively, each <code class="language-plaintext highlighter-rouge">TreeWalk</code> implementation becomes a collection of (ideally) pure functions.</p>
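<p>A stripped-down sketch of the same idea, using hypothetical <code class="language-plaintext highlighter-rouge">Greeter</code> classes of my own rather than the tree example, shows the two properties that matter: the concrete class is used without ever being instantiated, and the abstract contract itself cannot be instantiated.</p>

```python
from abc import ABC, abstractmethod


class Greeter(ABC):
    """Contract made up only of class-level methods."""

    @classmethod
    @abstractmethod
    def greet(cls, name: str) -> str:
        """Return a greeting for name"""


class EnglishGreeter(Greeter):
    @classmethod
    def greet(cls, name: str) -> str:
        return f"Hello, {name}"


# Called on the class itself; no instance (and so no instance state) exists.
message = EnglishGreeter.greet("Asha")

# The abstract contract cannot be instantiated at runtime.
# A type checker would also flag this line statically.
try:
    Greeter()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

print(message, abstract_instantiable)
```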

<h3 id="types-themselves-as-first-class-citizens"><em>Types</em> themselves as First Class Citizens</h3>

<p>What is cool is that the annotation system lets you take a class type satisfying an <code class="language-plaintext highlighter-rouge">ABC</code> as an argument rather than just an instance. As an illustration, I can now use this in a <code class="language-plaintext highlighter-rouge">WalkRegistry</code> as shown below:</p>

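<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">Optional</span><span class="p">,</span> <span class="n">Type</span>

<span class="k">class</span> <span class="nc">WalkRegistry</span><span class="p">:</span>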
    <span class="s">"""Registry of tree walkers"""</span>

    <span class="n">_register</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">Type</span><span class="p">[</span><span class="n">TreeWalk</span><span class="p">]]</span> <span class="o">=</span> <span class="p">{}</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="k">def</span> <span class="nf">register</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">walk</span><span class="p">:</span> <span class="n">Type</span><span class="p">[</span><span class="n">TreeWalk</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
        <span class="s">"""Registers a walk"""</span>
        <span class="k">assert</span> <span class="p">(</span>
            <span class="n">walk</span><span class="p">.</span><span class="n">name</span><span class="p">()</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">cls</span><span class="p">.</span><span class="n">_register</span>
        <span class="p">),</span> <span class="s">"Another walk registered with this name"</span>
        <span class="n">cls</span><span class="p">.</span><span class="n">_register</span><span class="p">[</span><span class="n">walk</span><span class="p">.</span><span class="n">name</span><span class="p">()]</span> <span class="o">=</span> <span class="n">walk</span>

    <span class="o">@</span><span class="nb">classmethod</span>
    <span class="k">def</span> <span class="nf">retrieve</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">name</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Type</span><span class="p">[</span><span class="n">TreeWalk</span><span class="p">]]:</span>
        <span class="s">"""Retrieves a walk by name"""</span>
        <span class="k">return</span> <span class="n">cls</span><span class="p">.</span><span class="n">_register</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>

<span class="c1"># Usage
</span><span class="n">WalkRegistry</span><span class="p">.</span><span class="n">register</span><span class="p">(</span><span class="n">DepthFirstWalk</span><span class="p">)</span>
<span class="k">assert</span> <span class="n">WalkRegistry</span><span class="p">.</span><span class="n">retrieve</span><span class="p">(</span><span class="s">"DEPTH_FIRST"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span>
</code></pre></div></div>

<p>Note here that the argument <code class="language-plaintext highlighter-rouge">walk</code> to the <code class="language-plaintext highlighter-rouge">WalkRegistry.register</code> classmethod is a <em>type</em> and not a concrete instance. That type input is then stored in a dictionary (<code class="language-plaintext highlighter-rouge">WalkRegistry._register</code> in the above code). In other words, types themselves are first class citizens of the annotation system.</p>

<p>If you are from a Java background, you can pass a type as an argument using <code class="language-plaintext highlighter-rouge">Class</code>, and a bounded wildcard such as <code class="language-plaintext highlighter-rouge">Class&lt;? extends TreeWalk&gt;</code> constrains it to a particular interface. However, as far as I know, you cannot then call static methods on it the way we call <code class="language-plaintext highlighter-rouge">walk.name()</code> in this code without resorting to reflection. In C++, as far as I know, this sort of code where you store a <em>type</em> in <code class="language-plaintext highlighter-rouge">std::map</code> is not possible at all.</p>

<p>I suppose this is one of the few unintended benefits of mixing a dynamically typed language with static type checking.</p>

<h2 id="rough-edges-things-that-could-be-better">Rough Edges (Things that could be better)</h2>

<p>As you can imagine, there are some rough edges you are likely going to encounter with this retrofitting of type safety into what is essentially a dynamically typed language.</p>

<h3 id="typeshed-and-external-libraries">Typeshed and External libraries</h3>

<p>The first and most obvious annoyance you are likely to run into is that not all existing third party libraries will have support for type annotations already. There exists an ongoing effort called <a href="https://github.com/python/typeshed/"><code class="language-plaintext highlighter-rouge">typeshed</code></a> that acts as a collection of type hint ‘stubs’ for major projects and is bundled with <code class="language-plaintext highlighter-rouge">mypy</code>. You should be able to find some of the popular libraries there (like Flask). Some of the other big libraries like <code class="language-plaintext highlighter-rouge">numpy</code> host the type stubs as part of the core project repository. Type stubs for other big libraries are maintained <a href="https://github.com/TypedDjango/django-stubs">separately</a>. Then again, you might be using a library whose primary author is <a href="https://github.com/coleifer/peewee/issues/1298">opposed in principle</a> to type annotations. So it’s a mixed bag, really.</p>

<p>However, in case you are daunted by the prospect of relying on a library that does not have type stubs, the key thing is not to throw the baby out with the bathwater and discard type annotations altogether. You will typically have some wrapper code around the external library you are going to use. The code inside those wrapper functions may not be type checked, and may liberally use <a href="https://mypy.readthedocs.io/en/stable/kinds_of_types.html#the-any-type">the Any type</a>, but it is still possible to make sure that the rest of your code has type integrity. The sanity that offers is well worth the ugly unchecked API boundaries/library wrappers.</p>
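<p>A minimal sketch of such a wrapper follows. The function <code class="language-plaintext highlighter-rouge">_legacy_fetch</code> is a made-up stand-in for a call into an unannotated third party library, and the record layout is likewise an assumption. The point is only that the <code class="language-plaintext highlighter-rouge">Any</code> value is confined to the wrapper, and callers see a fully typed signature.</p>

```python
from typing import Any, Optional


def _legacy_fetch(user_id):
    """Stand-in for an unannotated third party call (hypothetical)."""
    records = {1: {"name": "Asha", "address": None}}
    return records.get(user_id)


def fetch_customer_name(user_id: int) -> Optional[str]:
    # Inside the wrapper the value is Any, so nothing here is checked.
    raw: Any = _legacy_fetch(user_id)
    if raw is None:
        return None
    name = raw.get("name")
    # Validate at the boundary so callers can rely on the declared type.
    return name if isinstance(name, str) else None


print(fetch_customer_name(1))
print(fetch_customer_name(2))
```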

<h3 id="recursive-types">Recursive Types</h3>

<p>You are going to run into some issues with code where types reference themselves as part of their definition.</p>

<p>So for example, the following code <a href="https://github.com/python/mypy/issues/731">will not type check correctly (yet)</a>:</p>

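<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Callable</span><span class="p">,</span> <span class="n">List</span><span class="p">,</span> <span class="n">Union</span>

<span class="n">Callback</span> <span class="o">=</span> <span class="n">Callable</span><span class="p">[[</span><span class="nb">str</span><span class="p">],</span> <span class="s">"Callback"</span><span class="p">]</span>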
<span class="n">Foo</span> <span class="o">=</span> <span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">List</span><span class="p">[</span><span class="s">"Foo"</span><span class="p">]]</span>
</code></pre></div></div>

<p>Presented with the above code, <code class="language-plaintext highlighter-rouge">mypy</code> shows the following error:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error: Recursive types not fully supported yet, nested types replaced with "Any"
</code></pre></div></div>

<p>However, this code does indeed type check without issues:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">LinkedListNode</span><span class="p">:</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">next_node</span><span class="p">:</span> <span class="n">LinkedListNode</span>

    <span class="k">def</span> <span class="nf">next</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="s">"LinkedListNode"</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">next_node</span>
</code></pre></div></div>

<h3 id="errors-dont-reference-the-type-aliases">Errors Don’t Reference the Type Aliases</h3>

<p>Consider the code below with an obvious type error (possibly occurring during a refactor):</p>

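<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">math</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Tuple</span>

<span class="n">Point</span> <span class="o">=</span> <span class="n">Tuple</span><span class="p">[</span><span class="nb">float</span><span class="p">,</span> <span class="nb">float</span><span class="p">]</span>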
<span class="n">Line</span> <span class="o">=</span> <span class="n">Tuple</span><span class="p">[</span><span class="n">Point</span><span class="p">,</span> <span class="n">Point</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">calculate_length</span><span class="p">(</span><span class="n">line</span><span class="p">:</span> <span class="n">Line</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">float</span><span class="p">:</span>
    <span class="n">pt1</span><span class="p">,</span> <span class="n">pt2</span> <span class="o">=</span> <span class="n">line</span>
    <span class="c1"># ERROR: Referencing invalid members below
</span>    <span class="k">return</span> <span class="n">math</span><span class="p">.</span><span class="n">sqrt</span><span class="p">(</span><span class="nb">pow</span><span class="p">(</span><span class="n">pt2</span><span class="p">.</span><span class="n">x</span> <span class="o">-</span> <span class="n">pt1</span><span class="p">.</span><span class="n">x</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="o">+</span> <span class="nb">pow</span><span class="p">(</span><span class="n">pt2</span><span class="p">.</span><span class="n">y</span> <span class="o">-</span> <span class="n">pt1</span><span class="p">.</span><span class="n">y</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>
</code></pre></div></div>

<p>This results in the following errors from <code class="language-plaintext highlighter-rouge">mypy</code>:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error: "Tuple[float, float]" has no attribute "x"
error: "Tuple[float, float]" has no attribute "y"
</code></pre></div></div>

<p>It would have been nicer if the errors had read:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error: "Point" has no attribute "x"
error: "Point" has no attribute "y"
</code></pre></div></div>

<p>This might be a bit of a pet peeve and not that big an issue for most people. However, I use nested aliases heavily (as in the above code, <code class="language-plaintext highlighter-rouge">Point</code> being an alias for a tuple of floats, and <code class="language-plaintext highlighter-rouge">Line</code> being an alias for a tuple of <code class="language-plaintext highlighter-rouge">Point</code>s), and with enough nesting the errors become hard to read. For example, were we to (incorrectly) call <code class="language-plaintext highlighter-rouge">calculate_length("hi")</code> in the code above, the error <code class="language-plaintext highlighter-rouge">mypy</code> reports is:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error: Argument 1 to "calculate_length" has incompatible type "str"; expected "Tuple[Tuple[float, float], Tuple[float, float]]"
</code></pre></div></div>
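<p>One possible way around this, if you can afford to restructure the types, is to swap the aliases for <code class="language-plaintext highlighter-rouge">NamedTuple</code> classes. The sketch below is an assumption about how the example above could be rewritten, not something from my codebase; the field names <code class="language-plaintext highlighter-rouge">start</code> and <code class="language-plaintext highlighter-rouge">end</code> are made up. Since <code class="language-plaintext highlighter-rouge">Point</code> and <code class="language-plaintext highlighter-rouge">Line</code> are now real classes rather than aliases, <code class="language-plaintext highlighter-rouge">mypy</code> should be able to refer to them by name in its messages:</p>

```python
import math
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


class Line(NamedTuple):
    # Hypothetical field names, chosen for illustration.
    start: Point
    end: Point


def calculate_length(line: Line) -> float:
    # Attribute access is now valid, because Point really has x and y fields.
    return math.hypot(line.end.x - line.start.x, line.end.y - line.start.y)
```

<p>A nice side effect is that the <code class="language-plaintext highlighter-rouge">pt.x</code> style of access, which was the bug in the tuple version, becomes the correct way to read the fields.</p>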

<h3 id="serialisationdeserialisation">Serialisation/Deserialisation</h3>

<p>This is another boundary, much like the one with third party code, where you can’t enforce type safety as nicely as you would like. For starters, there isn’t really a JSON type annotation available because that <a href="https://github.com/python/mypy/issues/731#issuecomment-317401621">would again need recursive type support</a>.</p>

<p>Furthermore, there really isn’t anything as nice as <code class="language-plaintext highlighter-rouge">serde</code> in Rust or <code class="language-plaintext highlighter-rouge">marshal/unmarshal</code> in Golang available here. You will need to rely on the traditional Python serialisation and deserialisation facilities, which work well enough but still leave you with the task of manually rolling custom encoders and decoders per serialisation format.</p>

<p>I have also had some success using a third party library like <a href="https://github.com/ltworf/typedload"><code class="language-plaintext highlighter-rouge">typedload</code></a> to manage serialisation of simple types to and from <code class="language-plaintext highlighter-rouge">dict</code>, and from there onward to JSON.</p>
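<p>For a sense of what the hand rolled route looks like, here is a minimal sketch using only the standard library <code class="language-plaintext highlighter-rouge">json</code> module. The <code class="language-plaintext highlighter-rouge">Point</code> type and both function names are made up for illustration; the point is that the untyped <code class="language-plaintext highlighter-rouge">Dict[str, Any]</code> is confined to the decoder:</p>

```python
import json
from typing import Any, Dict, NamedTuple


class Point(NamedTuple):
    x: float
    y: float


def point_to_json(point: Point) -> str:
    """Encode a Point as a JSON object with explicit field names."""
    return json.dumps({"x": point.x, "y": point.y})


def point_from_json(text: str) -> Point:
    """Decode a Point; json.loads returns untyped data, so convert explicitly."""
    raw: Dict[str, Any] = json.loads(text)
    # A missing key raises KeyError and a non-numeric value raises
    # ValueError/TypeError here, at the boundary, rather than deep in the code.
    return Point(x=float(raw["x"]), y=float(raw["y"]))
```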

<h2 id="prerequisite-effort--setting--things-up">Prerequisite Effort / Setting Things Up</h2>

<p><code class="language-plaintext highlighter-rouge">mypy</code> already has <a href="https://mypy.readthedocs.io/en/stable/existing_code.html">helpful documentation</a> on how to incrementally introduce type hints into your existing codebase. So again, rather than providing a HOW-TO, I am going to list some of the effort I had to expend to get type annotations to play nicely with my existing codebase.</p>

<h3 id="preliminary-refactoring---dicts-to-classes-and-namedtuples">Preliminary Refactoring - Dicts to Classes and NamedTuples</h3>

<p>If your code relies almost exclusively on generic containers like <code class="language-plaintext highlighter-rouge">dict</code> and <code class="language-plaintext highlighter-rouge">list</code>, and only sparingly uses classes and <code class="language-plaintext highlighter-rouge">namedtuple</code>s, you are going to have your work cut out for you in adopting type hints. Having every function signature be an inscrutable <code class="language-plaintext highlighter-rouge">Dict[str, Any]</code> is not going to give you the same payoff as having concrete classes. Fortunately, this kind of over-reliance on generic containers over custom types is less prevalent in Python code bases than in functionally written JavaScript code.</p>

<p>Converting a <code class="language-plaintext highlighter-rouge">Dict</code> to a class is no small matter and will likely see cascading changes across the codebase. This is likely going to be the most painful part of “typing” your existing Python codebase. In my case, the rewards were well worth the pain, and once I had <code class="language-plaintext highlighter-rouge">mypy</code> run on the newly refactored code, I had a lot more confidence in my code’s correctness.</p>
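<p>To illustrate the kind of refactor I mean, here is a small before and after sketch. The <code class="language-plaintext highlighter-rouge">Order</code> type and its fields are invented for the example:</p>

```python
from typing import Any, Dict, NamedTuple


def order_total_untyped(order: Dict[str, Any]) -> float:
    # Before: a misspelt key or a wrongly typed value only surfaces at
    # runtime, because every lookup here is typed as Any.
    return order["quantity"] * order["unit_price"]


class Order(NamedTuple):
    item: str
    quantity: int
    unit_price: float


def order_total(order: Order) -> float:
    # After: field names and field types are visible to the type checker.
    return order.quantity * order.unit_price
```

<p>Every call site that used to build or unpack the dict has to change as well, which is where the cascading changes come from.</p>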

<h4 id="fighting--any-urges">Fighting <code class="language-plaintext highlighter-rouge">Any</code> Urges</h4>

<p>As you refactor, a good rule of thumb for when to switch a <code class="language-plaintext highlighter-rouge">dict</code> into a <code class="language-plaintext highlighter-rouge">class</code> is when you find yourself needing the <code class="language-plaintext highlighter-rouge">Dict[str, Any]</code> annotation. Except in wrapper functions dealing with third party library code, a profusion of <code class="language-plaintext highlighter-rouge">Any</code> is a sure sign that you are doing typing sub-optimally.</p>

<p>Once you have done the first type checking pass over your codebase, you must strive thereafter to avoid introducing new uses of <code class="language-plaintext highlighter-rouge">Any</code>. I say strive, because <code class="language-plaintext highlighter-rouge">Any</code> always provides a quick and dirty kludge out of a tricky type resolution problem. However, the more you go down that route, the less bang for the buck you’re going to get from the type annotations.</p>

<h3 id="generating-type-annotations">Generating Type Annotations</h3>

<p>While I hand rolled all my type annotations, there are projects like <a href="https://pypi.org/project/MonkeyType/"><code class="language-plaintext highlighter-rouge">MonkeyType</code></a> which help generate type annotations from type data collected at runtime. I reckon that even when using such a project, you will need to go over your code once to make sure the generated types are correct.</p>

<h3 id="mypy-configuration"><code class="language-plaintext highlighter-rouge">mypy</code> Configuration</h3>

<p>I spent a little time tinkering with the <a href="https://mypy.readthedocs.io/en/stable/config_file.html">configuration file, <code class="language-plaintext highlighter-rouge">mypy.ini</code></a>, to tweak <code class="language-plaintext highlighter-rouge">mypy</code>’s default behaviour to suit my project. For example, the <code class="language-plaintext highlighter-rouge">mypy.ini</code> I settled on looked like so:</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">[mypy]</span>
<span class="py">python_version</span> <span class="p">=</span> <span class="s">3.6</span>
<span class="py">warn_return_any</span> <span class="p">=</span> <span class="s">True</span>
<span class="py">warn_unused_configs</span> <span class="p">=</span> <span class="s">True</span>
<span class="py">ignore_missing_imports</span> <span class="p">=</span> <span class="s">True</span>
<span class="py">disallow_untyped_defs</span> <span class="p">=</span> <span class="s">True</span>
<span class="py">disallow_incomplete_defs</span> <span class="p">=</span> <span class="s">True</span>
<span class="py">check_untyped_defs</span> <span class="p">=</span> <span class="s">True</span>
<span class="py">mypy_path</span> <span class="p">=</span> <span class="s">"myproject"</span>

<span class="c"># Per-module options:
</span><span class="nn">[mypy-wrappers.*]</span>
<span class="py">ignore_missing_imports</span> <span class="p">=</span> <span class="s">False</span>
</code></pre></div></div>

<p>Some salient points in the above configuration are:</p>

<ul>
  <li>I opt in to stricter type checking for code <em>within</em> my project. <code class="language-plaintext highlighter-rouge">disallow_untyped_defs</code> and <code class="language-plaintext highlighter-rouge">disallow_incomplete_defs</code> are invaluable here because they notify you of functions where you may have forgotten to annotate an argument or to specify the return type. Those things definitely take a while to become second nature when you are transitioning from vanilla Python to typed Python.</li>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">ignore_missing_imports</code> setting in the per module configuration permits ignoring the lack of type hints in any third party modules imported in the <code class="language-plaintext highlighter-rouge">wrappers</code> module of my code, which houses the wrapper code around third party libraries. So assuming my code was organised like:</p>

    <div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>myproject
  - core_stuff
  - tests
  - wrappers
</code></pre></div>    </div>

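    <p>Then the intent of that per module configuration is to tell <code class="language-plaintext highlighter-rouge">mypy</code> to ignore imports that are missing type annotations inside the <code class="language-plaintext highlighter-rouge">wrappers</code> module alone, while NOT ignoring missing type annotations for imports used within <code class="language-plaintext highlighter-rouge">core_stuff</code> and <code class="language-plaintext highlighter-rouge">tests</code>. One caveat: according to <code class="language-plaintext highlighter-rouge">mypy</code>’s documentation, when <code class="language-plaintext highlighter-rouge">ignore_missing_imports</code> is used in a per module section, the section name is matched against the module being imported rather than the module containing the import, so it is worth verifying that your sections behave the way you expect.</p>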
  </li>
</ul>
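<p>To make the first point above concrete, here is a small sketch of the kinds of definitions those two flags are meant to catch. The function names are made up, and I am deliberately not quoting the exact messages <code class="language-plaintext highlighter-rouge">mypy</code> prints:</p>

```python
def area(width, height):
    # No annotations at all: the kind of definition that
    # disallow_untyped_defs is meant to report.
    return width * height


def scale(value: float, factor):
    # Partially annotated (no type for factor, no return type): the kind of
    # definition that disallow_incomplete_defs is meant to report.
    return value * factor


def perimeter(width: float, height: float) -> float:
    # Fully annotated: passes both checks.
    return 2 * (width + height)
```

<p>All three run fine as plain Python, which is exactly why it is easy to forget the annotations without the checker nagging you.</p>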

<p>Customising per module configurations is a great way to incrementally introduce typing to an existing codebase. You could start off by enforcing type checks on just one core module and leave others out entirely.</p>

<h3 id="editor-integration-setup">Editor Integration Setup</h3>

<p>Running <code class="language-plaintext highlighter-rouge">mypy</code> manually on the command line to check types as you write code gets tedious quickly. Editor integration for <code class="language-plaintext highlighter-rouge">mypy</code> is fortunately easy to find and is available for most of the popular editors and IDEs. For VSCode, the Python plugin ships with <code class="language-plaintext highlighter-rouge">mypy</code> support out of the box. In <code class="language-plaintext highlighter-rouge">nvim</code>, I have <code class="language-plaintext highlighter-rouge">mypy</code> set up as a Python linter run by the <a href="https://github.com/dense-analysis/ale">ALE</a> plugin, so I get type errors highlighted as soon as I save the file. This quick feedback makes working in a typed Python codebase quite a joy. With proper editor integration set up, the experience of coding in Python comes close to having a compiler check your work, as you may have come to rely on in other languages.</p>

<p><img src="/images/posts/vim_mypy_screenshot.png" alt="Screenshot of mypy errors in vim" /></p>

<p>One point to note in this regard is that I typically also have <code class="language-plaintext highlighter-rouge">pylint</code> running linting checks on my code <em>alongside</em> <code class="language-plaintext highlighter-rouge">mypy</code>. This catches a lot of other problems which are not really type errors, and so helps squash bugs and improve code quality nevertheless.</p>

<h3 id="pre-commit-hook">Pre-Commit Hook</h3>

<p>While having editor integration of <code class="language-plaintext highlighter-rouge">mypy</code> running checks on your code is great and gives quick feedback, the final source of truth is the code that gets committed. It is possible to commit the odd file in which type correctness broke because of changes you made in an entirely different module. This will presumably be a non-issue in a company where you probably already have elaborate CI/CD pipelines to do such enforcement for you. However, in my one-man personal projects, I find it vital to run linting checks before every commit and reject commits that fail those checks.</p>

<p>I use the excellent <a href="https://pre-commit.com/">pre-commit</a> library for this. I run both <code class="language-plaintext highlighter-rouge">mypy</code> and <code class="language-plaintext highlighter-rouge">pylint</code> checks in the pre-commit hooks to ensure type integrity across the project. While the <code class="language-plaintext highlighter-rouge">pre-commit</code> library lets you run the linters on just the files containing changes in your current commit, I typically run <code class="language-plaintext highlighter-rouge">mypy</code> on the entire project. In very large projects, however, your mileage may vary, and you may be saddled with longer <code class="language-plaintext highlighter-rouge">pre-commit</code> lint times.</p>
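<p>My actual hook configuration is not shown here, but the gist of what runs before each commit can be sketched as a small script. Everything in it is an assumption for illustration: the <code class="language-plaintext highlighter-rouge">run_checks</code> helper, the <code class="language-plaintext highlighter-rouge">DEFAULT_CHECKS</code> list and the <code class="language-plaintext highlighter-rouge">myproject</code> path. It runs each checker over the whole project and fails if any of them fails:</p>

```python
import subprocess
import sys
from typing import List, Sequence

# Assumed commands: both tools are run over the whole (hypothetical)
# "myproject" package rather than only over the files in the commit.
DEFAULT_CHECKS: List[List[str]] = [
    ["mypy", "myproject"],
    ["pylint", "myproject"],
]


def run_checks(commands: Sequence[Sequence[str]]) -> bool:
    """Run every command and report whether all of them exited with 0."""
    all_passed = True
    for command in commands:
        completed = subprocess.run(list(command), check=False)
        if completed.returncode != 0:
            # Keep going so that every checker gets to report its findings.
            all_passed = False
    return all_passed


def main() -> int:
    """Exit status for a hook: 0 lets the commit through, 1 rejects it."""
    return 0 if run_checks(DEFAULT_CHECKS) else 1
```

<p>A script like this could be called with <code class="language-plaintext highlighter-rouge">sys.exit(main())</code> from a locally defined hook, though using the ready-made hooks for each tool is usually less work.</p>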

<h2 id="final-thoughts">Final Thoughts</h2>

<p>Programming with the <code class="language-plaintext highlighter-rouge">mypy</code> type-checker looking over your shoulder is such a dramatic improvement from doing without it, that I find it almost crippling when I have to read and modify old Python code at work that does not have type hints.</p>

<p>Small things like <code class="language-plaintext highlighter-rouge">mypy</code> pointing out that you have not handled the null return case (or more accurately, the <code class="language-plaintext highlighter-rouge">None</code> return case) correctly have saved me from what would have been hard to debug errors several times now.</p>
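<p>A minimal sketch of that <code class="language-plaintext highlighter-rouge">None</code> case, with made up function names: once a function is annotated as returning <code class="language-plaintext highlighter-rouge">Optional[int]</code>, the caller has to deal with <code class="language-plaintext highlighter-rouge">None</code> before treating the result as an <code class="language-plaintext highlighter-rouge">int</code>.</p>

```python
from typing import Dict, Optional


def find_port(config: Dict[str, int], name: str) -> Optional[int]:
    """Return the configured port, or None if the service is not listed."""
    return config.get(name)


def port_or_default(config: Dict[str, int], name: str, default: int) -> int:
    port = find_port(config, name)
    # Returning `port` directly would not satisfy the `int` return type,
    # since it may be None; the explicit check narrows Optional[int] to int.
    if port is None:
        return default
    return port
```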

<p>To be clear, I would still not advocate writing a <em>new</em> service in Python. You would be much better off picking Golang, Rust or good ol’ Java for that, simply because of the performance advantages that these languages/runtimes offer over Python, along with their stricter type-safety guarantees. However, if you have an existing codebase already in Python that you would like to bring some sanity to, you could do worse than introducing type-hinting and slowly refactoring it.</p>

<p>I long for the day when someone writes for Python, an equivalent of what the Crystal (programming language) is for Ruby - a mandatory type-safe and performant successor. And when they do, PEP484 will alleviate the need to invent a new syntax.</p>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[If I am to start working on a new project today, I would hesitate to attempt it in a language that does not have compile-time type checking. However, I do have to deal with Python at work (though we are slowly phasing it out). Also, I have been working off and on, in my spare time, on a Python project that has over the past 3+ years gotten fairly large as personal projects go. It started out as a one-off quick script. It eventually evolved into something larger that actually does something useful for me so I ended up adding to it and maintaining it.]]></summary></entry><entry><title type="html">Debuggers</title><link href="https://balajeerc.info/Debuggers/" rel="alternate" type="text/html" title="Debuggers" /><published>2020-09-12T00:00:00+00:00</published><updated>2020-09-12T00:00:00+00:00</updated><id>https://balajeerc.info/Debuggers</id><content type="html" xml:base="https://balajeerc.info/Debuggers/"><![CDATA[<p>My thought on use of debuggers is something of an evolving opinion.</p>

<p>The first time I discovered a debugger was when a senior in college saw me, in the college’s computer lab, writing a program littered with <code class="language-plaintext highlighter-rouge">printf</code> statements.</p>

<p>He politely interrupted me and pointed out that I could use this thing called <code class="language-plaintext highlighter-rouge">gdb</code> instead and gave me a quick intro to the tool. At the time, it struck me as the bee’s knees and I started using it extensively.</p>

<p>(As an aside: several years later, I’d go on to co-found <a href="https://sensibull.com/" title="Sensibull">a company</a> with <a href="https://in.linkedin.com/in/abidhassan">that senior</a>. As luck would have it, he moved away from computing after college but probably had not too small a role in setting me off on my career in software development.)</p>

<p>Ever since I discovered debuggers and learnt to use them with some modicum of competence, I have been a vocal proponent of them. However, over the last few years my thoughts on this have undergone a significant transformation.</p>

<p>The first trigger for questioning my embrace of step debuggers was when a colleague at work I respect immensely, and an old college mate of mine that I regard highly, both described step debugging as ‘boring’, in separate conversations with me. At the time, I suppose I didn’t get the full import of what they were trying to convey. To me, getting stuff done as quickly as possible seemed more important than turning every bug hunting session into a kind of intellectual joust.</p>

<p>Later, I stumbled on two <a href="https://news.ycombinator.com/item?id=19829435">separate</a> <a href="https://news.ycombinator.com/item?id=19829435">posts</a> on Hacker News, and the comment storms that ensued in both cases extensively document the problems that others have with the use of debuggers. Both of these again got me grappling with my own views on the matter.</p>

<p>Now I eschew debuggers if I can. However, at this point, if you are forming a protest on why you have found debuggers so useful to you, let me try and offer concessions on 4 scenarios I have found where debuggers are indeed indispensable.</p>

<h2 id="exploring-a-new-codebase">Exploring a New Codebase</h2>

<p>The most genuine use for a debugger I’ve found is when reading a new codebase. I argue that stepping through code, inspecting the variables at key breakpoints, gives you much more insight into what is happening than just having to scan the code.</p>

<h2 id="grappling-with-object-oriented-state">Grappling with ‘Object Oriented’ State</h2>

<p>Something I realized a few years back was that use of a debugger was typically required in software that was written in a very ‘object oriented’ fashion. My first job, and the first couple of years in the one after that, were C++ shops doing game engines/computer graphics/scientific computing and I found that it was nigh impossible to navigate the complex interactions that arose from various objects mutating their internal state, and calling methods on each other.</p>

<p>Of course at that point my development style had transformed to rely on debuggers so much that I would be crippled when I had to make do with just an IDE that could compile and run code, but not do step debugging.</p>

<p>Later, as I moved to the web development side of things, I realized that I had been using debuggers as a crutch to work around code bases that were inherently hard to reason about. To be clear, these code bases were written by C++ experts, ran <code class="language-plaintext highlighter-rouge">cppcheck</code>, and generally followed most of the ever expanding hygiene practices prevalent in the C++ world. The code was not hard to read. Variables were well named. The code was structured reasonably well: it followed all the fancy ‘design patterns’ which you were expected to grok when they were introduced to you in the company’s orientation.</p>

<p>The trouble was that graphics engines/game engines are extremely stateful applications. And this state, distributed across a few thousand disparate black boxes quickly became a nightmare to hold in one’s head and reason about. As I started doing full stack web development, and more importantly, on the frontend, started working with functional programming paradigms, with all application state abstracted as a single central store (a la Redux), I found that my cognitive load to reason about applications was dramatically lowered. I realized that I was automatically relying far less on debuggers, especially when working on code in the React/Redux ecosystem that put emphasis on turning everything into small pure functions and composing them together.</p>
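<p>To give a flavour of the single store idea, here is a rough sketch of a Redux style reducer. It is written in Python only for illustration (the code I am describing was JavaScript), and the <code class="language-plaintext highlighter-rouge">State</code> and <code class="language-plaintext highlighter-rouge">Action</code> types are made up. All state lives in one immutable value, and the only way to change it is a pure function from the old state and an action to a new state:</p>

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    """The whole application state, held in one immutable value."""
    todos: Tuple[str, ...]
    filter_text: str


class Action(NamedTuple):
    kind: str
    payload: str


def reduce(state: State, action: Action) -> State:
    """Pure function: no mutation, the same inputs always give the same output."""
    if action.kind == "add_todo":
        return state._replace(todos=state.todos + (action.payload,))
    if action.kind == "set_filter":
        return state._replace(filter_text=action.payload)
    # Unknown actions leave the state untouched.
    return state
```

<p>Because nothing is mutated in place, working out how the application got into a given state comes down to replaying a list of actions, which needs far less stepping through code.</p>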

<h2 id="dynamically-typed-languages">Dynamically Typed Languages</h2>

<p>However, I still find myself relying on debuggers occasionally, but for an altogether different reason: lack of type information.</p>

<p>Since there’s no way for you to tell what the shape of the inputs to JS/Python functions is, I am forced to again open up the code in debuggers to investigate when errors occur. In fact, it’s nigh impossible to reason about JS bugs without a debugger, even when the code is stateless, even when it’s just pure functions calling each other. The problem stems from the fact that you don’t know the shape of the inputs to those functions, or whether one of the input params is <code class="language-plaintext highlighter-rouge">null</code> or <code class="language-plaintext highlighter-rouge">undefined</code>.</p>

<p>This is in contrast to a type-safe functional language like Haskell that I dabbled with in my spare time, where I never feel the need for a debugger. I realize now that while functional programming itself makes it easier to reason about state and mutations, dynamically typed languages still necessitate a debugger to reason about the shape of data when things go awry.</p>

<h2 id="imperatively-written-code">Imperatively Written Code</h2>

<p>In addition to the above scenarios, there is another kind of code that necessitates the use of debuggers: long imperatively written functions, full of while loops and mutations of counters and flags. More simply speaking: bad code. I have run up against this in code-bases written in both statically typed (C++, Golang, Java) and dynamically typed languages (JavaScript, Python).</p>

<p>While one strives to write elegant, simple code, more often than not you run up against an existing code-base full of complex functions like these, in which it is now your responsibility to fix a bug. Without a debugger, this will typically require you to construct a complex state machine in your head.</p>

<p>Some people love doing this. I find it an exercise in masochism. Life is too short to be expending intellectual effort on fixing shitty code. I’d rather spin up a debugger to troubleshoot a bug and squash it ASAP, or just refactor the existing code into easier to reason about smaller chunks. In the real world, most of the time, that refactor and the necessary re-testing of all the flows in that new code is just a luxury you cannot afford. So I prefer to just fix the bug and move on. The important thing I’ve found in these situations is to NOT be too eager to ‘be done with it’ either. As with all things in life, you need to strike a balance between narrowing down on the problem, and figuring out the overall context of that code, shitty as it may be, so as to not introduce new bugs in the process.</p>

<h2 id="the-no-debugger-straitjacket">The No Debugger Straitjacket</h2>

<p>These days when I write code, I put myself in an intellectual straitjacket: I refuse to use a debugger though I always have a debugger accessible to me. I don’t use naive <code class="language-plaintext highlighter-rouge">print</code> statements (or its equivalent) either.</p>

<p>Instead, when I do need to troubleshoot a bug, I add logs. Specifically, debug logs. Copious amounts of them. And any debug log I add, I leave in the code. After all, they only manifest when you turn <code class="language-plaintext highlighter-rouge">LOG_LEVEL</code> to the necessary verbosity.</p>

<p>This straitjacket has turned out to be a force for good. Whenever the logic gets hairy, I automatically start decomposing it into smaller, separate functions that make the code easier to reason about. Of course, I am a big fan of unit tests, and they are a force multiplier of their own.</p>
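<p>As a rough sketch of what this looks like in Python, using the standard <code class="language-plaintext highlighter-rouge">logging</code> module: here I assume <code class="language-plaintext highlighter-rouge">LOG_LEVEL</code> is read from an environment variable, and the logger name and the <code class="language-plaintext highlighter-rouge">average</code> function are made up for the example.</p>

```python
import logging
import os
from typing import List

# Hypothetical logger name for this example.
logger = logging.getLogger("myproject.stats")


def configure_logging() -> None:
    """Pick the verbosity from LOG_LEVEL (assumed to be an environment variable)."""
    level_name = os.environ.get("LOG_LEVEL", "WARNING").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        # Unknown level names fall back to the quiet default.
        level = logging.WARNING
    logging.basicConfig(level=level)


def average(values: List[float]) -> float:
    # These debug logs stay in the code; they are silent unless LOG_LEVEL=DEBUG.
    logger.debug("average() called with %d values: %r", len(values), values)
    if not values:
        logger.debug("empty input, returning 0.0")
        return 0.0
    result = sum(values) / len(values)
    logger.debug("average() result: %r", result)
    return result
```

<p>Running with <code class="language-plaintext highlighter-rouge">LOG_LEVEL=DEBUG</code> then shows the inputs and results at each call, without touching the code again.</p>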

<h2 id="final-thoughts">Final Thoughts</h2>

<p>There is a quote from Bob Martin that I think is particularly insightful in this regard:</p>

<blockquote>
  <p>I consider debuggers to be a drug – an addiction. Programmers can get into the horrible habit of depending on the debugger instead of on their brain. IMHO a debugger is a tool of last resort. Once you have exhausted every other avenue of diagnosis, and have given very careful thought to just rewriting the offending code, then you may need a debugger.</p>
</blockquote>

<p>I think this just about sums up my thoughts on the matter. I also think this to be the reasonable middle ground in the ‘should we or shouldn’t we’ as concerns debuggers: use a debugger if you have to, but once you are doing that, acknowledge that you are dealing with inherent shortcomings of your code base: complexity in state management, lack of types, or just outright poorly written code.</p>

<p>My rule of thumb is to spin up a debugger only if:</p>

<ul>
  <li>you are trying to understand a new code-base.</li>
  <li>you are grappling with extensive state management in an ‘object oriented’ codebase.</li>
  <li>you are trying to figure out the shape of a piece of data in a dynamically typed language.</li>
  <li>you are troubleshooting a bug in a poorly written piece of code that is not worth refactoring at the moment.</li>
</ul>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[My thought on use of debuggers is something of an evolving opinion.]]></summary></entry><entry><title type="html">On Project Management</title><link href="https://balajeerc.info/On-Project-Management/" rel="alternate" type="text/html" title="On Project Management" /><published>2019-05-27T00:00:00+00:00</published><updated>2019-05-27T00:00:00+00:00</updated><id>https://balajeerc.info/On-Project-Management</id><content type="html" xml:base="https://balajeerc.info/On-Project-Management/"><![CDATA[<p>I hear about Agile sometimes from co-workers, spoken either with reverence or absolute disgust. And no discussion on Agile is complete without a rant on its cornerstone: scrums.</p>

<p>I thought I’d mention here the <strong>only</strong> project that I ever worked on where any sort of project management technique that was deployed <em>actually worked</em>.</p>

<p>I once worked as a team lead on a project (in a different company) that had over-run its costs and was on the verge of cancellation because the PM (who was also the scrum master) gave us the wrong requirements document. (<em>dramatic pause</em>)</p>

<p>The CEO came over to me and asked me to take over, what with us now not having a PM till they could hire another. I used this opportunity to push for a change to our scrum process, which I felt was not working (though it had nothing to do with the aforementioned derailment) and asked him to give me leeway to follow a different process.</p>

<p>Cut to 3 months down the line, we managed to wrap up the project. Here were the changes we initiated:</p>

<ul>
  <li>Standup meetings sucked because everyone wanted to talk about minutiae in their stories, and consult with others on technical issues they were grappling with. We cut it down to each dev reporting one thing and one thing only: was the story he was working on proceeding as expected? If it wasn’t, what was the new ETA of the story? Devs could say: “Gosh, I have no idea now that this bottleneck is stymying me”. I’d enter a large number in the num days left column of that story and move on. I’d revisit them later, as described below.</li>
  <li>Previously, devs were expected to update their own story estimates every day on an online tool. This didn’t happen. Also, devs didn’t want to look like they were taking “too long” for something, so they’d estimate optimistically and overshoot. Instead, I used my discretion and updated all the story estimates in a Microsoft Project doc myself every day. When a junior dev said it would take him 8 hours more, I’d tell him: “let’s just make it 16 hours more to be safe”. The key change was: I did the timeline updates for all team members by myself and did it religiously.</li>
  <li>This resulted in a system where I would send the Project docs to the CEO every Friday. Now the CEO could clearly see possible problem areas of over-run as soon as they occurred. This enabled both of us to respond to them quickly: either assign more resources to our team for short durations or rejig more problematic stories to the most senior devs.</li>
</ul>

<p>The biggest lesson I learned in the process was that managers/execs don’t all necessarily want to keep breathing down your neck, or make you run faster than you want to. They just want to be told  <em>as soon</em>  as problems occur rather than letting delays build. Usually, once they knew what the issues were, they were very helpful. It was also a virtuous cycle: the execs now trusted the team’s ability to execute, and the team, in turn, felt more confident to turn to the execs for help when there were bottlenecks.</p>

<p>I also learnt that Microsoft Project is a planner’s best friend. Not so much the Gantt chart but the timeline view.</p>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[I hear about Agile sometimes from co-workers, spoken either with reverence or absolute disgust. And no discussion on Agile is complete without a rant on its cornerstone: scrums.]]></summary></entry><entry><title type="html">Lessons Learnt Last Year</title><link href="https://balajeerc.info/Lessons-Learnt-Last-Year/" rel="alternate" type="text/html" title="Lessons Learnt Last Year" /><published>2019-01-01T00:00:00+00:00</published><updated>2019-01-01T00:00:00+00:00</updated><id>https://balajeerc.info/Lessons-Learnt-Last-Year</id><content type="html" xml:base="https://balajeerc.info/Lessons-Learnt-Last-Year/"><![CDATA[<p>The following will be a short post, introspecting on the lessons learnt over the past year.</p>

<h2 id="lesson-1-learning-how-to-learn">Lesson 1: Learning How to Learn</h2>

<p>Possibly the most important lesson was a meta lesson of how to learn. I am an advocate of learning new tech by just building something and <code class="language-plaintext highlighter-rouge">stackoverflow</code>-ing your way through it. However, I now realize it’s far more effective to pick a comprehensive book on the topic, and stick to the intellectual straitjacket of looking up everything in that book.</p>

<p>That means that you don’t just Google for the answer, but you look up what you are trying to do in the Index of the book (remember those?)</p>

<p>The primary benefits I see to this process are:</p>

<ul>
  <li>looking up a book takes you to the same sections/snippets of code, reinforcing your memory rather than a Google search where you end up at a different link each time</li>
  <li>the more important benefit is the coherence in your mental mapping of the entire framework/body of knowledge, rather than fragmented tidbits you tend to gather by just Googling for a result</li>
  <li>another very important benefit is that while you look something up in a book, you often learn something related that you didn’t know about, which can change the decision you end up making.</li>
</ul>

<h2 id="lesson-2-typesafety-is-indispensable">Lesson 2: Typesafety is Indispensable</h2>

<p>I’ll probably write a longer post on this, but long story short, I am never picking a language without compile time type checking if I have a choice.</p>

<h2 id="lesson-3-learn-to-stop-worrying-and-love-sql">Lesson 3: Learn to Stop Worrying and Love SQL</h2>

<p>I should probably reword this as: stop bothering with ORMs and love SQL.
I have always hated SQL since I found the grammar very ugly. I hated it because it wasn’t composable. I hated it because it just had too many primitives to hold in your head.
I don’t anymore.
Stuff that hangs around and dominates for over a couple of decades must have some merit.
Once you get over the initial repulsion and start grokking it, the power it brings you is magnificent. 
Having a decent SQL IDE(?) like DBeaver also helps.</p>

<h2 id="lesson-4-pick-declarative-apis-over-imperative-ones">Lesson 4: Pick Declarative APIs over Imperative Ones</h2>

<p>We needed a job scheduler. A replacement for cron really. We started out with <code class="language-plaintext highlighter-rouge">celery</code> as the default choice. 
Turned out it was complex and not worth the effort. Celery had bugs involving the way it scheduled the next job before the previous one completed. And the worst part was that the workers routinely crashed.
I wish I remembered the details of the bug so that I could link to the celery bug report, but I don’t.</p>

<p>Anyway, long story short, we switched to a different scheduler and queue manager named <a href="https://mrq.readthedocs.io/">mrq</a>.</p>

<p>While we didn’t switch for this specific reason, its API just consumed a JSON input specifying the schedule and settings.</p>

<p>The result was that we could programmatically generate rather complex schedules and avoid a lot of boilerplate.</p>

<p>This pattern seems to recur over and over again, whether in specifying UI frontend elements (jQuery, Angular and other old frameworks vs React), or in picking schedulers or programming languages: whatever is more declarative always seems to be easier to reason about, more composable and more maintainable.</p>
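<p>To illustrate what “programmatically generating a schedule” can look like, here is a minimal Python sketch. The JSON keys (<code class="language-plaintext highlighter-rouge">task</code>, <code class="language-plaintext highlighter-rouge">params</code>, <code class="language-plaintext highlighter-rouge">interval_seconds</code>), the task path and the symbols are all made up for illustration; they are not mrq’s actual schema, so check its docs for the real field names.</p>

```python
import json


def build_schedule(symbols, interval_seconds):
    # One schedule entry per symbol. The keys are hypothetical and only
    # show the shape of a declarative, data-only schedule.
    return [
        {
            "task": "tasks.refresh_quotes",  # hypothetical task path
            "params": {"symbol": symbol},
            "interval_seconds": interval_seconds,
        }
        for symbol in symbols
    ]


def schedule_as_json(symbols, interval_seconds=60):
    # The scheduler only ever sees plain JSON, so the schedule can be
    # generated, diffed and version controlled like any other data.
    return json.dumps(build_schedule(symbols, interval_seconds), indent=2)
```

<p>Adding a hundred more jobs is then a matter of passing a longer list, rather than writing a hundred registration calls.</p>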

<h2 id="lesson-5-vscode">Lesson 5: VSCode</h2>

<p>I have always relied on vim to do most of the heavy lifting when it comes to code editing. I used to thumb my nose at people relying on more “modern” editors.</p>

<p>However, the advent of Golang into my life has forced me to reconsider this. Code completion, debugging and jumping to symbol using golang plugins for vim have been less than satisfactory. I reluctantly bit the bullet and started tinkering around with VSCode, telling myself it’s only going to be for golang editing.</p>

<p>It was rough starting out. The vim plugin didn’t exactly work like I wanted it. I hated having to use the mouse to switch between editor and the terminal. I hated that I couldn’t use vim bindings in my terminal. I hated that I had to use arrow keys to move between intellisense suggestions. And omg, do you really have to take up so much screen real estate with that sidebar and status bar and half a dozen toolbars I never see myself using?</p>

<p>However, after a weekend of tinkering with the keymaps, I came to the revelation that VSCode is <strong>incredibly</strong> customizable. Using keybindings.json, I could customize <em>every little detail</em> to work exactly the way I wanted. You’d have to try it to believe it. Switch to <code class="language-plaintext highlighter-rouge">hjkl</code> to move around windows. Check. Use <code class="language-plaintext highlighter-rouge">jk</code> to move up and down intellisense suggestions. Check. Want a custom terminal that you love in place of the default console? Sure, no probs. Don’t want UI clutter? Use ‘Zen’ mode. Wait, but I want the status bar in Zen mode but not the menu bar. Yes, can do.</p>

<p>And the best part is that all the settings and keybindings can be managed via files. That I can now version control.</p>

]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[The following will be a short post, introspecting on the lessons learnt over the past year.]]></summary></entry><entry><title type="html">Rooting Out Cpu Bottlenecks From Asyncio Based Api Services</title><link href="https://balajeerc.info/Rooting-out-CPU-Bottlenecks-from-asyncio-based-API-services/" rel="alternate" type="text/html" title="Rooting Out Cpu Bottlenecks From Asyncio Based Api Services" /><published>2018-11-09T00:00:00+00:00</published><updated>2018-11-09T00:00:00+00:00</updated><id>https://balajeerc.info/Rooting-out-CPU-Bottlenecks-from-asyncio-based-API-services</id><content type="html" xml:base="https://balajeerc.info/Rooting-out-CPU-Bottlenecks-from-asyncio-based-API-services/"><![CDATA[<p>At work, we use a Python Tornado based API server that now uses an <code class="language-plaintext highlighter-rouge">asyncio</code>  based event loop. Event loop based API servers (like Tornado via python asyncio, and expressJS on Node) excel at handling large volumes of traffic that are primarily IO bound (i.e. for services that mostly just read data from DB/cache, do a few transformations on them, and send them back to the client).</p>

<p>What event loops are quite terrible for are tasks that involve a lot of CPU usage. One of the best write-ups on what to do and NOT do with event loop based services is the guide titled <a href="https://nodejs.org/en/docs/guides/dont-block-the-event-loop/">Don’t Block the Event Loop</a> that is part of the NodeJS docs. While a lot of it is specific to NodeJS, much of the advice is valid for any event loop based API server, such as most Python frameworks that leverage <code class="language-plaintext highlighter-rouge">asyncio</code>.</p>

<p>The bottom line is that we need to guard against any of our services doing CPU intensive processing as part of a request handler. A lot of the time this is obvious: crypto related tasks, image compression, data encoding etc. all hog the CPU, but these things stick out and the bottlenecks manifest at development time itself.</p>

<p>Then there are insidious CPU heavy tasks, such as JSON parsing or serialization. We make frequent use of <code class="language-plaintext highlighter-rouge">json.dumps</code>  and <code class="language-plaintext highlighter-rouge">json.loads</code>  and these tend to kill our server’s performance because JSON parsing/serialization is CPU intensive. The trouble is that JSON encoding/decoding slips under your radar during development, but ends up causing your server load to spike at scale in production.</p>

<p>This document illustrates how to profile a specific service to check if it is CPU bound to help with optimizing it.</p>
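<p>To make the blocking behaviour described above concrete, here is a small, self-contained sketch (not our production code). The handler and the “other request” are stand-ins I made up; the point is only the order in which they get to run. When the handler serializes everything without ever awaiting, the other request is served only after it finishes. When it yields between chunks with <code class="language-plaintext highlighter-rouge">asyncio.sleep(0)</code>, the other request gets its turn in between.</p>

```python
import asyncio
import json


async def handler(log, chunks, cooperative):
    log.append("handler: start")
    for chunk in chunks:
        json.dumps(chunk)  # pure CPU work, never yields on its own
        if cooperative:
            await asyncio.sleep(0)  # hand control back to the event loop
    log.append("handler: done")


async def other_request(log):
    # Stands in for any other request waiting for its turn on the loop.
    log.append("other: served")


async def run_demo(cooperative):
    log = []
    chunks = [[{"id": i, "name": "item-%d" % i} for i in range(100)]] * 5
    await asyncio.gather(handler(log, chunks, cooperative), other_request(log))
    return log
```

<p>Running <code class="language-plaintext highlighter-rouge">asyncio.run(run_demo(False))</code> and <code class="language-plaintext highlighter-rouge">asyncio.run(run_demo(True))</code> and comparing the two logs shows the difference. Note that yielding only improves fairness; the CPU cost itself is still paid on the event loop thread.</p>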

<h2 id="tool-setup">Tool Setup</h2>

<p>The following are the tools you need to do profiling of our services:</p>

<ul>
  <li><a href="https://httpd.apache.org/docs/2.4/programs/ab.html">Apache <code class="language-plaintext highlighter-rouge">ab</code></a>
    <ul>
      <li>This is the tool we’ll use to hit our local API instance with a load.</li>
      <li>On Ubuntu, you can install this by running: <code class="language-plaintext highlighter-rouge">apt-get install apache2-utils</code></li>
    </ul>
  </li>
  <li><a href="https://jiffyclub.github.io/snakeviz/">SnakeViz</a>
    <ul>
      <li>This is a tool used to visualize the file generated by Python’s  <code class="language-plaintext highlighter-rouge">cProfile</code>  profiler.</li>
      <li>Installed using: <code class="language-plaintext highlighter-rouge">pip install snakeviz</code></li>
    </ul>
  </li>
</ul>

<h2 id="profiling">Profiling</h2>

<p>The first step is to start the API server with profiling enabled. We use Python’s built-in <a href="https://docs.python.org/3.6/library/profile.html"><code class="language-plaintext highlighter-rouge">cProfile</code></a> profiler.</p>

<p>Here’s how to run our API server with profiling enabled.</p>

<p><code class="language-plaintext highlighter-rouge">python -m cProfile -o instrument_details_profile.prf server.py</code></p>

<h2 id="load-testing">Load Testing</h2>

<p>The next step in the profiling process is to hit the server with a load. The following snippet demonstrates how to hit the API server with a load on the instrument details service.</p>

<p>Note that in case you have any authentication cookies that are needed to be sent to the service, you can add them via <code class="language-plaintext highlighter-rouge">-C</code> option that <code class="language-plaintext highlighter-rouge">ab</code> provides.</p>

<p><code class="language-plaintext highlighter-rouge">$ ab -n 10000 -c 20 https://api.sensibull.test/v1/instrument_details</code></p>

<p>You ought to end up with some results that look like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Benchmarking api.sensibull.test (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:        nginx/1.14.0
Server Hostname:        api.sensibull.test
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name:        api.sensibull.test

Document Path:          /v1/instrument_details
Document Length:        63 bytes

Concurrency Level:      20
Time taken for tests:   16.502 seconds
Complete requests:      10000
Failed requests:        0
Non-2xx responses:      10000
Total transferred:      2690000 bytes
HTML transferred:       630000 bytes
Requests per second:    606.00 [#/sec] (mean)
Time per request:       33.003 [ms] (mean)
Time per request:       1.650 [ms] (mean, across all concurrent requests)
Transfer rate:          159.19 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    2   0.9      2      20
Processing:     1   31   6.3     30      72
Waiting:        1   31   6.3     29      72
Total:          3   33   6.5     31      75

Percentage of the requests served within a certain time (ms)
  50%     31
  66%     33
  75%     35
  80%     36
  90%     40
  95%     47
  98%     53
  99%     58
 100%     75 (longest request)
</code></pre></div></div>

<p><strong>Make sure that Non-2xx response count is 0.</strong>  Otherwise, you’d end up benchmarking how fast your API server can dole out errors (as in the sample output above, where all 10000 responses were non-2xx). Silly as this may sound, I committed this blunder a couple of times.</p>
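<p>If you script your load tests, this check is easy to automate. The following is a rough sketch that scans the text printed by <code class="language-plaintext highlighter-rouge">ab</code> for the two counters shown in the sample output above. The function names are my own, and I am assuming (from what I have seen of <code class="language-plaintext highlighter-rouge">ab</code>) that the “Non-2xx responses” line is simply absent when the count is zero.</p>

```python
import re


def _counter(ab_output, label):
    # Find a line such as "Failed requests:        0" and return the number.
    # A missing line is treated as a count of 0.
    pattern = r"^" + re.escape(label) + r":\s+(\d+)"
    match = re.search(pattern, ab_output, re.MULTILINE)
    return int(match.group(1)) if match else 0


def non_2xx_count(ab_output):
    return _counter(ab_output, "Non-2xx responses")


def ab_run_is_valid(ab_output):
    # A run is only worth profiling if nothing failed and every
    # response was a 2xx.
    failed = _counter(ab_output, "Failed requests")
    return failed == 0 and non_2xx_count(ab_output) == 0
```

<p>Fed the sample output above, <code class="language-plaintext highlighter-rouge">ab_run_is_valid</code> would return <code class="language-plaintext highlighter-rouge">False</code>, which is exactly the warning you want before you start reading profiler results.</p>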

<h2 id="visualizing-the-profiler-results">Visualizing the Profiler Results</h2>
<p>The generated profile file is a binary format that needs a visualizer to read. Snakeviz does a great job with this:</p>

<p><code class="language-plaintext highlighter-rouge">snakeviz instrument_details_profile.prf</code></p>

<p>You should get a result which looks something like this:</p>

<p><img src="/images/posts/profile_results_pre_optimize.png" alt="profile results pre optimization" /></p>

<h2 id="interpreting-the-profiler-results">Interpreting the Profiler Results</h2>

<p>The key result from the above visualization table is the first entry when you sort by the <code class="language-plaintext highlighter-rouge">tottime</code>  column. This column shows the total time spent in the given function (excluding time spent in calls to sub-functions). Note that most of the time being spent in one function does not necessarily indicate a CPU bottleneck. This time may simply be time that your service handler spent fetching a record from a table. While you may want to investigate why DB fetch times or time needed to retrieve cache data is high, time spent waiting for IO does not increase CPU loads.</p>

<p>The key is to identify which function specifically ends up consuming most of the execution time AND determine if that function is CPU intensive.</p>
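<p>If you would rather read the same numbers in a terminal than in a browser, the standard library’s <code class="language-plaintext highlighter-rouge">pstats</code> module can sort a profile file by <code class="language-plaintext highlighter-rouge">tottime</code> too. Here is a minimal sketch; the helper names and the toy JSON workload in <code class="language-plaintext highlighter-rouge">demo</code> are mine and only exist to make the example self-contained.</p>

```python
import cProfile
import io
import json
import os
import pstats
import tempfile


def profile_to_file(func, profile_path):
    # Profile a single call and write the binary stats file that
    # snakeviz and pstats both understand.
    profiler = cProfile.Profile()
    profiler.runcall(func)
    profiler.dump_stats(profile_path)


def top_functions_by_tottime(profile_path, limit=10):
    # Return a text report of the functions with the highest tottime.
    out = io.StringIO()
    stats = pstats.Stats(profile_path, stream=out)
    stats.strip_dirs().sort_stats("tottime").print_stats(limit)
    return out.getvalue()


def demo():
    # Toy workload standing in for a JSON heavy request handler.
    payload = [{"id": i, "name": "item-%d" % i} for i in range(2000)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_profile.prf")
        profile_to_file(lambda: json.dumps(payload), path)
        return top_functions_by_tottime(path, limit=5)
```

<p>For the server profile generated earlier, you would call <code class="language-plaintext highlighter-rouge">top_functions_by_tottime("instrument_details_profile.prf")</code> and look at the first few rows.</p>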

<p>For example, in the previous snapshot from <code class="language-plaintext highlighter-rouge">snakeviz</code>, you see that most of the time was spent in <code class="language-plaintext highlighter-rouge">encoder.py</code>  from the Python standard library’s <code class="language-plaintext highlighter-rouge">json</code>  package. This is a strong hint that this service is going to hurt your <code class="language-plaintext highlighter-rouge">asyncio</code>  event loop’s performance: most of the execution time was spent doing JSON serialization.</p>

<h2 id="removing-cpu-bottlenecks-from-async-services">Removing CPU Bottlenecks from async services</h2>

<p>Removing the CPU bottlenecks from your API pipeline requires some creativity. Some possibilities to consider are:</p>

<ul>
  <li>Can you change the shape of the data that you are storing in cache or DB so that you can send it as is, as a string, without first parsing it and then serializing it again? For example, the solution we used at work was to read from the cache and send the string AS IS back to the client, skipping both the parse and the re-serialize.</li>
  <li>Can you offload the performance intensive part to another process altogether via RPC?</li>
  <li>Can you <a href="https://docs.python.org/3/library/asyncio-eventloop.html#executing-code-in-thread-or-process-pools">run the CPU bound function in a thread-pool or process-pool</a> that asyncio provides?</li>
</ul>
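<p>The first and third options above can be sketched roughly as follows. This is an illustration under my own naming, not the code we run at work. One caveat worth stating: threads in CPython share the GIL, so for pure Python CPU work it is usually the process pool, not the default thread pool, that actually takes load off the event loop. Also, <code class="language-plaintext highlighter-rouge">asyncio.get_running_loop</code> needs Python 3.7 or newer.</p>

```python
import asyncio
import json


def passthrough_response(cached_json_text):
    # Option 1: the cache already holds serialized JSON, so hand the
    # string back untouched instead of json.loads followed by json.dumps.
    return cached_json_text


async def serialize_off_loop(payload, executor=None):
    # Option 3: run json.dumps in an executor so the event loop is free
    # to serve other requests while the serialization happens.
    # executor=None uses the loop's default thread pool; pass a
    # concurrent.futures.ProcessPoolExecutor to use separate processes.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, json.dumps, payload)
```

<p>Keep in mind that a process pool has to pickle the payload to ship it across, which is itself CPU work. That is part of why the pass-through approach was the one that paid off for us.</p>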

<p>Once you remove the bottlenecks, you should see the profiler showing the following result:</p>

<p><img src="/images/posts/profile_results_post_optimize.webp" alt="profile results post optimization" /></p>

<p>Things are looking much better now. Note that most of the time is spent in <code class="language-plaintext highlighter-rouge">select.epoll</code>  which is basically <code class="language-plaintext highlighter-rouge">asyncio</code>  waiting on events from some IO device, be it DB socket or Redis socket. This is what asyncio excels at and this does not block the CPU like JSON encode/decode does.</p>

<p>Happy debugging!</p>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[At work, we use a Python Tornado based API server that now uses an asyncio based event loop. Event loop based API servers (like Tornado via python asyncio, and expressJS on Node) excel at handling large volumes of traffic that are primarily IO bound (i.e. for services that mostly just read data from DB/cache, do a few transformations on them, and send them back to the client).]]></summary></entry><entry><title type="html">On D3</title><link href="https://balajeerc.info/On-d3/" rel="alternate" type="text/html" title="On D3" /><published>2017-11-01T00:00:00+00:00</published><updated>2017-11-01T00:00:00+00:00</updated><id>https://balajeerc.info/On-d3</id><content type="html" xml:base="https://balajeerc.info/On-d3/"><![CDATA[<p>This post is going to be a collection of my observations about d3, both its API and its source code, as I explored using it for some personal work.</p>

<p>My requirement when coming to d3 was to draw charts. In fact, before actually getting my feet wet with it, all my interactions with other developers led me to categorize d3 as a charting API.</p>

<p>After actually exploring it however, I realize now that I was wrong. d3 is not a charting API so much as it is a generalized data visualization API. Specifically, d3 is an API to transform data into SVG data (and manipulate SVG DOM). I think we ought to stop referring to D3 as a “charting API” in common parlance and say something more accurate, like “D3 is to SVG (and to some extent, HTML too) what jQuery is to HTML”.</p>

<p>To put a finer point on it, when I started exploring d3, I expected to find a data driven API, basically a function that would accept some data, presumably in JSON, and output a graph based on it. However, what I found instead is a functional JavaScript API that constructs SVG data using a series of transformations of input data.</p>

<h2 id="the-api">The API</h2>
<p>I decided to start with the API itself since that is what a d3 user, usually a developer, encounters straight off the bat.</p>

<p>I am going to evaluate the API (and provide my observations thereof) based on two pre-determined criteria:</p>

<ul>
  <li>how declarative is the API?</li>
  <li>how often do “patterns” in use of the API repeat across multiple visualizations?</li>
</ul>

<h3 id="how-declarative-is-the-api">How declarative is the API?</h3>

<p>Consider the introductory tutorial code for creating a bar chart using d3:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">data</span> <span class="o">=</span> <span class="p">[</span><span class="mi">30</span><span class="p">,</span> <span class="mi">86</span><span class="p">,</span> <span class="mi">168</span><span class="p">,</span> <span class="mi">281</span><span class="p">,</span> <span class="mi">303</span><span class="p">,</span> <span class="mi">365</span><span class="p">];</span>

<span class="nx">d3</span><span class="p">.</span><span class="nx">select</span><span class="p">(</span><span class="dl">"</span><span class="s2">.chart</span><span class="dl">"</span><span class="p">)</span>
  <span class="p">.</span><span class="nx">selectAll</span><span class="p">(</span><span class="dl">"</span><span class="s2">div</span><span class="dl">"</span><span class="p">)</span>
  <span class="p">.</span><span class="nx">data</span><span class="p">(</span><span class="nx">data</span><span class="p">)</span>
    <span class="p">.</span><span class="nx">enter</span><span class="p">()</span>
    <span class="p">.</span><span class="nx">append</span><span class="p">(</span><span class="dl">"</span><span class="s2">div</span><span class="dl">"</span><span class="p">)</span>
    <span class="p">.</span><span class="nx">style</span><span class="p">(</span><span class="dl">"</span><span class="s2">width</span><span class="dl">"</span><span class="p">,</span> <span class="kd">function</span><span class="p">(</span><span class="nx">d</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="nx">d</span> <span class="o">+</span> <span class="dl">"</span><span class="s2">px</span><span class="dl">"</span><span class="p">;</span> <span class="p">})</span>
    <span class="p">.</span><span class="nx">text</span><span class="p">(</span><span class="kd">function</span><span class="p">(</span><span class="nx">d</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="nx">d</span><span class="p">;</span> <span class="p">});</span>
</code></pre></div></div>

<p>An observation that can be made straight off the bat is that d3 is declarative - i.e. the use of the API involves “specifying” a series of transformations (chained functions) rather than issuing a series of function calls (in an imperative style).</p>

<p>Another point to note is that d3 does not just concern itself with generating SVG. The above example is generating a series of rectangular ‘div’ elements. So d3 does HTML DOM manipulation as well as SVG DOM manipulation.</p>

<h3 id="how-often-do-patterns-in-use-of-the-api-repeat-across-multiple-visualizations">How often do “patterns” in use of the API repeat across multiple visualizations?</h3>

<p>This question bears answering since the answer determines the ease of learning this API to the point where you can wield it to generate more customized visualizations.</p>

<p>The answer is that learning the d3 API primarily involves learning the ‘core’ d3 module, which consists most importantly of selections, transitions and data manipulation functions. The rest of the API consists of a series of utilities: to interpolate data, perform geospatial calculations and transformations, with some often needed geometry primitives thrown in.</p>

<p>While I am no d3 expert, getting the d3 selection API down would pretty much let you wield d3 with ease, since these selection patterns are what you find repeating across visualizations.</p>

<h2 id="the-code">The Code</h2>

<h3 id="code-organization">Code Organization</h3>

<p>Instead of a monolith-library, d3 is an umbrella of distinct libraries with clear separation of concerns. At the top level is a repo named <code class="language-plaintext highlighter-rouge">d3/d3</code> which is primarily just a meta-repo pulling in all the other d3 dependency repos.</p>

<p>Each of the other libraries is in a repo of its own. Some of them are:</p>

<ul>
  <li>d3/d3-selection</li>
  <li>d3/d3-shape</li>
  <li>d3/d3-scale</li>
  <li>d3/d3-hierarchy</li>
  <li>…</li>
</ul>

<h3 id="code-exploration">Code Exploration</h3>

<p>As an exercise to determine whether d3 code is easily hackable, I decided to take a look at the two functions that seem most intriguing in the d3 API: <code class="language-plaintext highlighter-rouge">data()</code> and <code class="language-plaintext highlighter-rouge">enter()</code>. To me these really seem to be the essence of the d3 API, because it is in these functions that the magic of the correlation between the input data and the generated SVG/HTML DOM nodes seems to occur.</p>

<p>The other functions, as I see them, are a bunch of selectors and setters of HTML/SVG elements/attributes and the rest are utilities (like data interpolation and date formatting) which I can find other utility libraries to do anyway.</p>
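<p>Before reading the source, it helped me to have a mental model of what that correlation is. The following is a conceptual sketch in Python (to stay consistent with the other snippets on this blog), not a port of d3’s code. It assumes the simplest case, where nodes and data are matched by position rather than by a key function, and it returns three plain lists instead of d3’s selections.</p>

```python
from itertools import zip_longest

_MISSING = object()  # marks "nothing at this position"


def bind_by_index(nodes, data):
    # Pair existing nodes with data items position by position.
    enter, update, exit_ = [], [], []
    for node, datum in zip_longest(nodes, data, fillvalue=_MISSING):
        if datum is _MISSING:
            exit_.append(node)  # a node with no datum left for it
        elif node is _MISSING:
            enter.append(datum)  # a datum that has no node yet
        else:
            update.append((node, datum))  # node and datum line up
    return enter, update, exit_
```

<p>In the bar chart tutorial above, <code class="language-plaintext highlighter-rouge">selectAll("div")</code> matches nothing at first, so this model puts all six data items in the “enter” list, which is why the chain calls <code class="language-plaintext highlighter-rouge">enter()</code> and then <code class="language-plaintext highlighter-rouge">append("div")</code>.</p>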

<p>As a complete beginner to d3, it took me less than 30 seconds to track down the source code to <a href="https://github.com/d3/d3-selection/blob/master/src/selection/data.js"><code class="language-plaintext highlighter-rouge">data()</code></a>. That speaks volumes about how good the code organization is in this project.</p>

<p>It appears that most code in d3 is so modularized that most files never exceed 200 lines or so. That is again an impressive feat considering d3’s size.</p>

<p>The following was the code for the default exported function from the <code class="language-plaintext highlighter-rouge">data.js</code> file.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="k">default</span> <span class="kd">function</span><span class="p">(</span><span class="nx">value</span><span class="p">,</span> <span class="nx">key</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">value</span><span class="p">)</span> <span class="p">{</span>
    <span class="nx">data</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="k">this</span><span class="p">.</span><span class="nx">size</span><span class="p">()),</span> <span class="nx">j</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
    <span class="k">this</span><span class="p">.</span><span class="nx">each</span><span class="p">(</span><span class="kd">function</span><span class="p">(</span><span class="nx">d</span><span class="p">)</span> <span class="p">{</span> <span class="nx">data</span><span class="p">[</span><span class="o">++</span><span class="nx">j</span><span class="p">]</span> <span class="o">=</span> <span class="nx">d</span><span class="p">;</span> <span class="p">});</span>
    <span class="k">return</span> <span class="nx">data</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="kd">var</span> <span class="nx">bind</span> <span class="o">=</span> <span class="nx">key</span> <span class="p">?</span> <span class="nx">bindKey</span> <span class="p">:</span> <span class="nx">bindIndex</span><span class="p">,</span>
      <span class="nx">parents</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">_parents</span><span class="p">,</span>
      <span class="nx">groups</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">_groups</span><span class="p">;</span>

  <span class="k">if</span> <span class="p">(</span><span class="k">typeof</span> <span class="nx">value</span> <span class="o">!==</span> <span class="dl">"</span><span class="s2">function</span><span class="dl">"</span><span class="p">)</span> <span class="nx">value</span> <span class="o">=</span> <span class="nx">constant</span><span class="p">(</span><span class="nx">value</span><span class="p">);</span>

  <span class="k">for</span> <span class="p">(</span><span class="kd">var</span> <span class="nx">m</span> <span class="o">=</span> <span class="nx">groups</span><span class="p">.</span><span class="nx">length</span><span class="p">,</span> <span class="nx">update</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="nx">m</span><span class="p">),</span> <span class="nx">enter</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="nx">m</span><span class="p">),</span> <span class="nx">exit</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="nx">m</span><span class="p">),</span> <span class="nx">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">j</span> <span class="o">&lt;</span> <span class="nx">m</span><span class="p">;</span> <span class="o">++</span><span class="nx">j</span><span class="p">)</span> <span class="p">{</span>
    <span class="kd">var</span> <span class="nx">parent</span> <span class="o">=</span> <span class="nx">parents</span><span class="p">[</span><span class="nx">j</span><span class="p">],</span>
        <span class="nx">group</span> <span class="o">=</span> <span class="nx">groups</span><span class="p">[</span><span class="nx">j</span><span class="p">],</span>
        <span class="nx">groupLength</span> <span class="o">=</span> <span class="nx">group</span><span class="p">.</span><span class="nx">length</span><span class="p">,</span>
        <span class="nx">data</span> <span class="o">=</span> <span class="nx">value</span><span class="p">.</span><span class="nx">call</span><span class="p">(</span><span class="nx">parent</span><span class="p">,</span> <span class="nx">parent</span> <span class="o">&amp;&amp;</span> <span class="nx">parent</span><span class="p">.</span><span class="nx">__data__</span><span class="p">,</span> <span class="nx">j</span><span class="p">,</span> <span class="nx">parents</span><span class="p">),</span>
        <span class="nx">dataLength</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">length</span><span class="p">,</span>
        <span class="nx">enterGroup</span> <span class="o">=</span> <span class="nx">enter</span><span class="p">[</span><span class="nx">j</span><span class="p">]</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="nx">dataLength</span><span class="p">),</span>
        <span class="nx">updateGroup</span> <span class="o">=</span> <span class="nx">update</span><span class="p">[</span><span class="nx">j</span><span class="p">]</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="nx">dataLength</span><span class="p">),</span>
        <span class="nx">exitGroup</span> <span class="o">=</span> <span class="nx">exit</span><span class="p">[</span><span class="nx">j</span><span class="p">]</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="nx">groupLength</span><span class="p">);</span>

    <span class="nx">bind</span><span class="p">(</span><span class="nx">parent</span><span class="p">,</span> <span class="nx">group</span><span class="p">,</span> <span class="nx">enterGroup</span><span class="p">,</span> <span class="nx">updateGroup</span><span class="p">,</span> <span class="nx">exitGroup</span><span class="p">,</span> <span class="nx">data</span><span class="p">,</span> <span class="nx">key</span><span class="p">);</span>

    <span class="c1">// Now connect the enter nodes to their following update node, such that</span>
    <span class="c1">// appendChild can insert the materialized enter node before this node,</span>
    <span class="c1">// rather than at the end of the parent node.</span>
    <span class="k">for</span> <span class="p">(</span><span class="kd">var</span> <span class="nx">i0</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="nx">i1</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="nx">previous</span><span class="p">,</span> <span class="nx">next</span><span class="p">;</span> <span class="nx">i0</span> <span class="o">&lt;</span> <span class="nx">dataLength</span><span class="p">;</span> <span class="o">++</span><span class="nx">i0</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">if</span> <span class="p">(</span><span class="nx">previous</span> <span class="o">=</span> <span class="nx">enterGroup</span><span class="p">[</span><span class="nx">i0</span><span class="p">])</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="nx">i0</span> <span class="o">&gt;=</span> <span class="nx">i1</span><span class="p">)</span> <span class="nx">i1</span> <span class="o">=</span> <span class="nx">i0</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
        <span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="nx">next</span> <span class="o">=</span> <span class="nx">updateGroup</span><span class="p">[</span><span class="nx">i1</span><span class="p">])</span> <span class="o">&amp;&amp;</span> <span class="o">++</span><span class="nx">i1</span> <span class="o">&lt;</span> <span class="nx">dataLength</span><span class="p">);</span>
        <span class="nx">previous</span><span class="p">.</span><span class="nx">_next</span> <span class="o">=</span> <span class="nx">next</span> <span class="o">||</span> <span class="kc">null</span><span class="p">;</span>
      <span class="p">}</span>
    <span class="p">}</span>
  <span class="p">}</span>

  <span class="nx">update</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Selection</span><span class="p">(</span><span class="nx">update</span><span class="p">,</span> <span class="nx">parents</span><span class="p">);</span>
  <span class="nx">update</span><span class="p">.</span><span class="nx">_enter</span> <span class="o">=</span> <span class="nx">enter</span><span class="p">;</span>
  <span class="nx">update</span><span class="p">.</span><span class="nx">_exit</span> <span class="o">=</span> <span class="nx">exit</span><span class="p">;</span>
  <span class="k">return</span> <span class="nx">update</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now, I am not sure if I have the full picture, but I would NOT approve a merge request featuring the above code. There were several instances where I went “Ugh” as I was reading this code:</p>

<ul>
  <li>variables accessed before they were declared (consider <code class="language-plaintext highlighter-rouge">data</code>). I understand that hoisting works and that this wouldn’t be problematic, but still, ugh.</li>
  <li>multiple assignment expressions in the same line: <code class="language-plaintext highlighter-rouge">data = new Array(this.size()), j = -1;</code></li>
  <li>
    <p>a completely unnecessary use of Array object constructor <code class="language-plaintext highlighter-rouge">data = new Array(this.size())</code>. As far as I know, the net result of:</p>

    <div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nx">data</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Array</span><span class="p">(</span><span class="k">this</span><span class="p">.</span><span class="nx">size</span><span class="p">()),</span> <span class="nx">j</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
  <span class="k">this</span><span class="p">.</span><span class="nx">each</span><span class="p">(</span><span class="kd">function</span><span class="p">(</span><span class="nx">d</span><span class="p">)</span> <span class="p">{</span> <span class="nx">data</span><span class="p">[</span><span class="o">++</span><span class="nx">j</span><span class="p">]</span> <span class="o">=</span> <span class="nx">d</span><span class="p">;</span> <span class="p">});</span>
</code></pre></div>    </div>
    <p>is no different from:</p>
    <div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nx">data</span> <span class="o">=</span> <span class="p">[],</span> <span class="nx">j</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
  <span class="k">this</span><span class="p">.</span><span class="nx">each</span><span class="p">(</span><span class="kd">function</span><span class="p">(</span><span class="nx">d</span><span class="p">)</span> <span class="p">{</span> <span class="nx">data</span><span class="p">[</span><span class="o">++</span><span class="nx">j</span><span class="p">]</span> <span class="o">=</span> <span class="nx">d</span><span class="p">;</span> <span class="p">});</span>
</code></pre></div>    </div>
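    <p>simply because, unlike other languages, which use dynamically resizing contiguous memory blocks for their vectors, JS arrays are, at least as far as the language semantics go, just glorified hashtables with a variable tracking their length. (Engines may well treat a preallocated array differently under the hood, so this could be a deliberate performance choice.)</p>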
  </li>
  <li>Highly “stateful” logic, in that there are extensive mutations of variables. Then again, this might have been done with performance in mind.</li>
</ul>

<p>Once you get past all the initial revulsion at the arcane coding style, what would strike one as puzzling is the use of <code class="language-plaintext highlighter-rouge">this</code> in a seemingly non-member function. However, once you recall that the <code class="language-plaintext highlighter-rouge">data</code> function is actually called on a collection of DOM elements that previous selectors returned, you realize that this function must be getting invoked on the collection. So, I started looking for where this gets bound to an object or an object prototype. A few greps later, I discovered in <code class="language-plaintext highlighter-rouge">selection/index.js</code>:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">Selection</span><span class="p">.</span><span class="nx">prototype</span> <span class="o">=</span> <span class="nx">selection</span><span class="p">.</span><span class="nx">prototype</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">constructor</span><span class="p">:</span> <span class="nx">Selection</span><span class="p">,</span>
  <span class="na">select</span><span class="p">:</span> <span class="nx">selection_select</span><span class="p">,</span>
  <span class="na">selectAll</span><span class="p">:</span> <span class="nx">selection_selectAll</span><span class="p">,</span>
  <span class="na">filter</span><span class="p">:</span> <span class="nx">selection_filter</span><span class="p">,</span>
  <span class="na">data</span><span class="p">:</span> <span class="nx">selection_data</span><span class="p">,</span>
  <span class="na">enter</span><span class="p">:</span> <span class="nx">selection_enter</span><span class="p">,</span>
  <span class="na">exit</span><span class="p">:</span> <span class="nx">selection_exit</span><span class="p">,</span>
  <span class="na">merge</span><span class="p">:</span> <span class="nx">selection_merge</span><span class="p">,</span>
  <span class="na">order</span><span class="p">:</span> <span class="nx">selection_order</span><span class="p">,</span>
  <span class="na">sort</span><span class="p">:</span> <span class="nx">selection_sort</span><span class="p">,</span>
  <span class="na">call</span><span class="p">:</span> <span class="nx">selection_call</span><span class="p">,</span>
  <span class="na">nodes</span><span class="p">:</span> <span class="nx">selection_nodes</span><span class="p">,</span>
  <span class="na">node</span><span class="p">:</span> <span class="nx">selection_node</span><span class="p">,</span>
  <span class="na">size</span><span class="p">:</span> <span class="nx">selection_size</span><span class="p">,</span>
  <span class="na">empty</span><span class="p">:</span> <span class="nx">selection_empty</span><span class="p">,</span>
  <span class="na">each</span><span class="p">:</span> <span class="nx">selection_each</span><span class="p">,</span>
  <span class="na">attr</span><span class="p">:</span> <span class="nx">selection_attr</span><span class="p">,</span>
  <span class="na">style</span><span class="p">:</span> <span class="nx">selection_style</span><span class="p">,</span>
  <span class="na">property</span><span class="p">:</span> <span class="nx">selection_property</span><span class="p">,</span>
  <span class="na">classed</span><span class="p">:</span> <span class="nx">selection_classed</span><span class="p">,</span>
  <span class="na">text</span><span class="p">:</span> <span class="nx">selection_text</span><span class="p">,</span>
  <span class="na">html</span><span class="p">:</span> <span class="nx">selection_html</span><span class="p">,</span>
  <span class="na">raise</span><span class="p">:</span> <span class="nx">selection_raise</span><span class="p">,</span>
  <span class="na">lower</span><span class="p">:</span> <span class="nx">selection_lower</span><span class="p">,</span>
  <span class="na">append</span><span class="p">:</span> <span class="nx">selection_append</span><span class="p">,</span>
  <span class="na">insert</span><span class="p">:</span> <span class="nx">selection_insert</span><span class="p">,</span>
  <span class="na">remove</span><span class="p">:</span> <span class="nx">selection_remove</span><span class="p">,</span>
  <span class="na">datum</span><span class="p">:</span> <span class="nx">selection_datum</span><span class="p">,</span>
  <span class="na">on</span><span class="p">:</span> <span class="nx">selection_on</span><span class="p">,</span>
  <span class="na">dispatch</span><span class="p">:</span> <span class="nx">selection_dispatch</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Now it all made sense. The following was what I could glean from reading the rest of the code without any further information:</p>

<ul>
  <li>The <code class="language-plaintext highlighter-rouge">data</code> function is called with an array (or a function that returns one) as its first argument.</li>
  <li>The values of this input data are “bound” to the selected nodes, i.e. a one-to-one mapping between the data and the selected nodes is established using the <code class="language-plaintext highlighter-rouge">bindKey</code> or <code class="language-plaintext highlighter-rouge">bindIndex</code> functions.</li>
  <li><code class="language-plaintext highlighter-rouge">bindKey</code> comes into play when a key function is passed as the second argument; otherwise <code class="language-plaintext highlighter-rouge">bindIndex</code> pairs data items with nodes by their index.</li>
  <li>What both the bind functions do is establish a mapping between the data and the input node selection. Each data item is stored in a member named <code class="language-plaintext highlighter-rouge">__data__</code> in the corresponding node.</li>
  <li>What is returned is a transformed selection of nodes, each of which is “aware” of the data it needs to render as part of itself.</li>
</ul>

<h4 id="the-not-so-clear-bits">The Not So Clear Bits</h4>

<ul>
  <li>There are repeated mentions of “enter nodes” and “exit nodes”, which are not clear to me as of now, considering this is my first reading of this code. Hopefully, poking around the rest of the files and API docs should make things clearer, failing which I can probably drop into the d3 developer IRC channels for help.</li>
</ul>

<h2 id="overall-impressions">Overall Impressions</h2>

<h3 id="what-i-like-about-d3">What I Like About d3</h3>

<ul>
  <li>I love the API. If I were creating a set of utilities to do manipulation of the DOM (be it HTML or SVG), this is exactly how I would structure it.</li>
  <li>The code modularization is really superb. A beginner can jump in and start deciphering the key bits almost immediately.</li>
  <li>I love the essence of the idea behind the D3 API: take input numerical data and apply a series of transformations to it until you have a visualization. There is a lot of functional programming zen in this approach.</li>
  <li>I admire the fact that the code is readable enough for a noob to jump in and start examining the key internals of the code within just a few minutes.</li>
</ul>

<h3 id="what-i-dont-like-about-d3">What I Don’t Like About d3</h3>

<ul>
  <li>The d3 umbrella of modules tries to be everything that its user needs. While each of these modules seems well written and organized, I feel that the d3 authors suffer from the ‘IKEA syndrome’ or the ‘Not-Invented-Here’ syndrome, choosing to write even the most basic array manipulation library functions by themselves, instead of using any of the dozens out there. For example, I really can’t fathom why a visualization API would feature:
    <ul>
      <li>date manipulation utilities (<code class="language-plaintext highlighter-rouge">d3.time</code>)</li>
      <li>data interpolation toolkit (it has <a href="https://github.com/d3/d3-3.x-api-reference/blob/master/Arrays.md">most of something like numpy in there</a>!)</li>
      <li><a href="https://github.com/d3/d3-3.x-api-reference/blob/master/Requests.md#d3_xhr">XHR utilities (seriously?!)</a></li>
    </ul>
  </li>
  <li>The code uses a lot of archaic, non-idiomatic JS which seems a bit out of place considering the use of ES6 constructs mixed in with it.</li>
</ul>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[This post is going to be a collection of my observations about d3 - both its API and its source code as I explored using it for some personal work.]]></summary></entry><entry><title type="html">Good Oop Design Favours Less Oop</title><link href="https://balajeerc.info/Good-OOP-Design-Favours-Less-OOP/" rel="alternate" type="text/html" title="Good Oop Design Favours Less Oop" /><published>2017-09-24T00:00:00+00:00</published><updated>2017-09-24T00:00:00+00:00</updated><id>https://balajeerc.info/Good-OOP-Design-Favours-Less-OOP</id><content type="html" xml:base="https://balajeerc.info/Good-OOP-Design-Favours-Less-OOP/"><![CDATA[<p>If you work in a shop where they talk of object oriented programming in glowing terms, you probably got handed a copy of the ‘Design Patterns’ book by Erich Gamma et al., the so-called “Gang of Four”, or the ‘Head First Design Patterns’ book by Eric Freeman et al. Your colleagues probably pride themselves on how they can “think and communicate with each other” in all these repeatedly used abstractions called “design patterns”. Yeah, I know. I was one of them.</p>

<p>Having read through most of both those books, it appears that most of the “design patterns” are an advisory on how NOT to use the building blocks of OOP, the cornerstone among them being inheritance.</p>

<p>Let me illustrate with an example. Consider <a href="https://www.safaribooksonline.com/library/view/head-first-design/0596007124/ch01.html">the very first chapter of ‘Head First Design Patterns’</a>. It introduces the so called ‘Strategy Pattern’ using a quirky - and what the author hopes is a funny - example involving ducks.</p>

<p>The scenario is set up as follows. There is a company that makes a ‘Duck Simulator’ app. In the beginning, they have three kinds of ducks. The first stab at a simplistic design involves a Duck super class followed by three derived classes. The ducks differ only in their appearance; they swim and quack the same way.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Duck</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">display</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="c1">// Abstract, each duck displays differently</span>
    <span class="kt">void</span> <span class="n">swim</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ...swim like all ducks do */</span> <span class="p">}</span>
    <span class="kt">void</span> <span class="n">quack</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ...quack like all ducks do */</span> <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MallardDuck</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Duck</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="kt">void</span> <span class="n">display</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... Display a Mallard duck */</span> <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">RedheadDuck</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Duck</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="kt">void</span> <span class="n">display</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... Display a redhead duck */</span> <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">RubberDuck</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Duck</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="kt">void</span> <span class="n">display</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... Display a rubber duck */</span> <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>But soon, the company’s executives come up with a customer requirement:  they need to make <em>some</em> of these ducks fly.</p>

<p>The first stab that the company’s engineer (“Joe”) takes at it is to implement a new function in the <code class="language-plaintext highlighter-rouge">Duck</code> class.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Duck</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">display</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="c1">// Abstract, each duck displays differently</span>
    <span class="kt">void</span> <span class="n">swim</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ...swim like all ducks do */</span> <span class="p">}</span>
    <span class="kt">void</span> <span class="n">quack</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ...quack like all ducks do */</span> <span class="p">}</span>
    <span class="kt">void</span> <span class="n">fly</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... fly like all ducks do?? */</span> <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>This allows for a flying duck, but as you can imagine this causes quite a fracas, since rubber ducks don’t fly in the real world. However, since <code class="language-plaintext highlighter-rouge">RubberDuck</code> also inherits from the <code class="language-plaintext highlighter-rouge">Duck</code> class that implements a common <code class="language-plaintext highlighter-rouge">fly</code> method, its instances end up flying in the simulation. (Not good.)</p>
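<p>To make the problem concrete, here is a minimal sketch of that first attempt. It is my own illustration rather than the book’s code (the book uses Java), and having the methods return strings instead of drawing anything is an assumption made purely to keep it short.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string&gt;

class Duck {
public:
    virtual ~Duck() = default;
    virtual std::string display() const = 0;
    // Added to the base class, so every subclass silently inherits it.
    std::string fly() const { return "*flap flap whoooosh*"; }
};

class RubberDuck : public Duck {
public:
    std::string display() const override { return "a rubber duck"; }
    // Nothing here mentions flying, yet RubberDuck().fly() compiles
    // and returns the same result as for any other duck.
};
</code></pre></div></div>

<p>Nothing in <code class="language-plaintext highlighter-rouge">RubberDuck</code> opts in to flying; the behaviour arrives through the base class whether it is wanted or not.</p>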

<p>So the next solution he comes up with is to make the <code class="language-plaintext highlighter-rouge">Duck</code> class an interface containing only abstract functions that are common to all ducks, moving all the implementations of all the functions to the concrete classes. In addition, since only some ducks can fly, he abstracts that away into a separate interface: <code class="language-plaintext highlighter-rouge">Flyable</code>.</p>

<p>So his new design looks as follows:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Duck</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">display</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">swim</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">quack</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Flyable</span> <span class="p">{</span>
<span class="nl">public:</span>
	<span class="k">virtual</span> <span class="kt">void</span> <span class="n">fly</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MallardDuck</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Duck</span><span class="p">,</span> <span class="k">public</span> <span class="n">Flyable</span> <span class="p">{</span>
	<span class="c1">// ... implementations of quack, swim, display and fly</span>
<span class="p">};</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">RubberDuck</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Duck</span> <span class="p">{</span>
	<span class="c1">// ... implementations of quack, swim and display (but not fly) </span>
<span class="p">};</span>
</code></pre></div></div>
<p>While this works, there is a lot of duplication of code between various Duck implementations since the quack, fly and swim behaviour are identical between the ducks. (Actually, in the example even the quack behaviour is separated out as a separate interface so that it can be overridden in the <code class="language-plaintext highlighter-rouge">RubberDuck</code> to squeak, but I am going to ignore that detail for now).</p>

<p>Then the book introduces a “Design Principle”, highlighted in a box, complete with a picture of yin-yang to emphasize the zen in the statement: “Identify the aspects of your application that vary and separate them from what stays the same”.</p>

<p>The recommendation that follows is to extract the various “behaviours” of ducks from the definitions of the ducks themselves. The final design they come up with at the end is as follows (diagram straight from the book):</p>

<p><img src="https://www.safaribooksonline.com/library/view/head-first-design/0596007124/figs/web/022fig01.png.jpg" alt="The Grand Duck Design" /></p>

<p>The chapter finally ends with more zen, advocating the principle ‘Composition is better than inheritance’. It also congratulates the reader on having learnt a new pattern: the <strong>strategy pattern</strong>.</p>
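<p>In code, the composed design boils down to something like the following. This is a minimal C++ sketch of my own, not the book’s Java code: the class and method names are loosely modelled on the book’s diagram, the two <code class="language-plaintext highlighter-rouge">make...</code> helpers are hypothetical, only the fly behaviour is shown, and the behaviours return strings as a simplifying assumption.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;memory&gt;
#include &lt;string&gt;
#include &lt;utility&gt;

// Behaviour interface: one small class per way of flying.
struct FlyBehavior {
    virtual ~FlyBehavior() = default;
    virtual std::string fly() const = 0;
};

struct FlyWithWings : FlyBehavior {
    std::string fly() const override { return "*flap flap whoooosh*"; }
};

struct FlyNoWay : FlyBehavior {
    std::string fly() const override { return "Can't fly."; }
};

// A duck holds a behaviour object instead of inheriting fly() from a base class.
class Duck {
public:
    explicit Duck(std::unique_ptr&lt;FlyBehavior&gt; flyBehavior)
        : flyBehavior_(std::move(flyBehavior)) {}

    // Delegate to whichever behaviour is currently plugged in.
    std::string performFly() const { return flyBehavior_-&gt;fly(); }

    // The behaviour can be swapped at runtime.
    void setFlyBehavior(std::unique_ptr&lt;FlyBehavior&gt; flyBehavior) {
        flyBehavior_ = std::move(flyBehavior);
    }

private:
    std::unique_ptr&lt;FlyBehavior&gt; flyBehavior_;
};

// Hypothetical helpers that wire each kind of duck to its behaviour.
inline Duck makeMallardDuck() { return Duck(std::make_unique&lt;FlyWithWings&gt;()); }
inline Duck makeRubberDuck() { return Duck(std::make_unique&lt;FlyNoWay&gt;()); }
</code></pre></div></div>

<p>Notice that each “behaviour class” is really just a function wearing a class costume, which is the observation the rest of this post builds on.</p>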

<p>Ok, now let’s take a step back and see what just happened here:</p>

<ol>
  <li>Joe Engineer loves OOP so much that that’s how he models everything.</li>
  <li>The authors then come over and tell Joe to do four things:
    <ul>
      <li>Extract the behaviours from the class into separate classes.</li>
      <li>Not derive from a super class, but program to interfaces instead. Heck, don’t use inheritance at all!</li>
      <li>Program to an interface (again, don’t inherit from a concrete class).</li>
      <li>Favor composition over inheritance. (For the last time, inheritance sucks ok? Just don’t.)</li>
    </ul>
  </li>
  <li>They come up with a new “design pattern” to solve the problem still using classes as their units of abstraction.</li>
</ol>

<p>Let’s rephrase their most significant advice without losing its essence:</p>
<ul>
  <li>Separate procedures from data</li>
  <li>Inheritance is bollocks</li>
  <li>Interfaces are great</li>
</ul>

<p>Well whadayya know! The best sagelike OOP advice is to do less of it and think more like functional programmers do.</p>

<p>What distinctive characteristic of OOP do you have left once you remove inheritance, separate object behaviour from object structure, and just use interfaces to abstract a “class” of objects that have a common set of behaviours?</p>

<p>Ans: You get functional programming a la Haskell:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">{-# LANGUAGE DeriveAnyClass #-}</span>

<span class="c1">-- Define a few placeholder types for each of</span>
<span class="c1">-- the duck operations. </span>
<span class="c1">-- Just strings for now, but these can be as complex</span>
<span class="c1">-- as necessary</span>
<span class="kr">type</span> <span class="kt">FlightResult</span> <span class="o">=</span> <span class="kt">String</span> 
<span class="kr">type</span> <span class="kt">SwimResult</span> <span class="o">=</span> <span class="kt">String</span>
<span class="kr">type</span> <span class="kt">QuackResult</span> <span class="o">=</span> <span class="kt">String</span>
<span class="kr">type</span> <span class="kt">DisplayResult</span> <span class="o">=</span> <span class="kt">String</span>

<span class="c1">-- Define a typeclass for what all Ducks ought to </span>
<span class="c1">-- be able to do</span>
<span class="kr">class</span> <span class="kt">DuckLike</span> <span class="n">a</span> <span class="kr">where</span>
    <span class="n">fly</span> <span class="o">::</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">FlightResult</span>
    <span class="n">swim</span> <span class="o">::</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">SwimResult</span>
    <span class="n">quack</span> <span class="o">::</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">QuackResult</span>
    <span class="n">display</span> <span class="o">::</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">DisplayResult</span>

<span class="c1">-- Write some generic actions that will be </span>
<span class="c1">-- shared across/common to all ducks</span>
<span class="n">genericQuack</span> <span class="o">::</span> <span class="kt">QuackResult</span>
<span class="n">genericQuack</span> <span class="o">=</span> <span class="s">"*quack quack*"</span>

<span class="n">genericSwim</span> <span class="o">::</span> <span class="kt">SwimResult</span>
<span class="n">genericSwim</span> <span class="o">=</span> <span class="s">"*paddle paddle*"</span>

<span class="n">genericFly</span> <span class="o">::</span> <span class="kt">FlightResult</span>
<span class="n">genericFly</span> <span class="o">=</span> <span class="s">"*flap flap whoooosh*"</span>

<span class="c1">-- Define types for the various ducks as instances of the DuckLike typeclass</span>
<span class="kr">data</span> <span class="kt">MallardDuck</span> <span class="o">=</span> <span class="kt">MallardDuck</span> <span class="p">{</span> <span class="n">mallardName</span> <span class="o">::</span> <span class="kt">String</span> <span class="p">}</span>
    <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">DuckLike</span> <span class="kt">MallardDuck</span> <span class="kr">where</span>
    <span class="n">fly</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">genericFly</span>
    <span class="n">swim</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">genericSwim</span>
    <span class="n">quack</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">genericQuack</span>
    <span class="n">display</span> <span class="n">duck</span> <span class="o">=</span> <span class="s">"Hi, I am a Mallard Duck named "</span> <span class="o">++</span> <span class="p">(</span><span class="n">mallardName</span> <span class="n">duck</span><span class="p">)</span>

<span class="kr">data</span> <span class="kt">RedheadDuck</span> <span class="o">=</span> <span class="kt">RedheadDuck</span> <span class="p">{</span>  <span class="n">redheadName</span> <span class="o">::</span> <span class="kt">String</span> <span class="p">}</span>
    <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">DuckLike</span> <span class="kt">RedheadDuck</span> <span class="kr">where</span>
    <span class="n">fly</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">genericFly</span>
    <span class="n">swim</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">genericSwim</span>
    <span class="n">quack</span>  <span class="kr">_</span> <span class="o">=</span> <span class="n">genericQuack</span>
    <span class="n">display</span> <span class="n">duck</span> <span class="o">=</span> <span class="s">"Hello there, I am a Redhead Duck named "</span> <span class="o">++</span> <span class="p">(</span><span class="n">redheadName</span> <span class="n">duck</span><span class="p">)</span>

<span class="c1">-- Note how we don't use the generic fly and quack</span>
<span class="c1">-- functions for RubberDuck</span>
<span class="kr">data</span> <span class="kt">RubberDuck</span> <span class="o">=</span> <span class="kt">RubberDuck</span> <span class="p">{</span> <span class="n">rubberName</span> <span class="o">::</span> <span class="kt">String</span> <span class="p">}</span>
    <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">)</span>
<span class="kr">instance</span> <span class="kt">DuckLike</span> <span class="kt">RubberDuck</span> <span class="kr">where</span>
    <span class="n">fly</span> <span class="kr">_</span> <span class="o">=</span> <span class="s">"Can't fly. Can't do anything really. Sigh..."</span>
    <span class="n">swim</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">genericSwim</span>
    <span class="n">quack</span> <span class="kr">_</span> <span class="o">=</span> <span class="s">"*Squeak squeak*"</span>
    <span class="n">display</span> <span class="n">duck</span> <span class="o">=</span> <span class="s">"I am a just a stupid rubber duck named "</span> <span class="o">++</span> <span class="p">(</span><span class="n">rubberName</span> <span class="n">duck</span><span class="p">)</span>
</code></pre></div></div>
<p>Running this in <code class="language-plaintext highlighter-rouge">ghci</code> gives:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*Main&gt; let redgy = RedheadDuck "Redgy"
*Main&gt; quack redgy
"*quack quack*"
*Main&gt; fly redgy
"*flap flap whoooosh*"
*Main&gt; display redgy
"Hello there, I am a Redhead Duck named Redgy"
*Main&gt; let rubbaar = RubberDuck "Rubbaar"
*Main&gt; fly rubbaar
"Can't fly. Can't do anything really. Sigh..."
*Main&gt; quack rubbaar
"*Squeak squeak*"
*Main&gt; swim rubbaar
"*paddle paddle*"
*Main&gt; display rubbaar
"I am a just a stupid rubber duck named Rubbaar"
</code></pre></div></div>
<p>Note that we have accomplished all the zen goals stated in the book chapter:</p>
<ol>
  <li>We have achieved distinct types for each of the ducks, and yet all of them conform to an ‘interface’.</li>
  <li>We have extracted the functionality that is “most likely to change most often” from the core interfaces and types that won’t change as often. You can go on adding new types of ducks without having to change any existing code.</li>
  <li>We have managed not to duplicate common generic functions that are uniform across most ducks in each type instance.</li>
  <li>Specific types can implement functionality that is different from the generic behaviours.</li>
</ol>
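<p>To make the second point concrete, here is a minimal, self-contained sketch of adding one more duck. The class and <code class="language-plaintext highlighter-rouge">genericSwim</code> are re-declared only so that the snippet compiles on its own, with bodies assumed from the ghci output above. <code class="language-plaintext highlighter-rouge">DecoyDuck</code> and <code class="language-plaintext highlighter-rouge">decoyName</code> are hypothetical names that are not part of the original code.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Re-declared here so the sketch compiles alone; the signatures are an
-- assumption based on how the instances above use them.
class DuckLike a where
    fly     :: a -&gt; String
    swim    :: a -&gt; String
    quack   :: a -&gt; String
    display :: a -&gt; String

-- Body assumed from the ghci output shown above.
genericSwim :: String
genericSwim = "*paddle paddle*"

-- Hypothetical new duck type, not part of the original code.
data DecoyDuck = DecoyDuck { decoyName :: String }
    deriving (Show)

-- The new type only needs its own instance; nothing declared above is edited.
instance DuckLike DecoyDuck where
    fly _ = "Decoys stay put."
    swim _ = genericSwim
    quack _ = "..."
    display duck = "I am a wooden decoy named " ++ decoyName duck

main :: IO ()
main = do
    let woody = DecoyDuck "Woody"
    putStrLn (display woody)
    putStrLn (swim woody)
</code></pre></div></div>
<p>Compiled on its own, this should print the decoy’s introduction followed by <code class="language-plaintext highlighter-rouge">*paddle paddle*</code>. The existing duck types and the class itself are left untouched.</p>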

<p>Heck, I could have implemented this in vanilla C++, with some templated functions.</p>

<h2 id="the-point-im-trying-to-make">The Point I’m trying to Make…</h2>
<p>OOP proponents unnecessarily model what are essentially procedures as “classes”. Then they come up with convoluted machinery called “design patterns” to try and work with them - all the while handing out sagelike advice to stop using inheritance and to extract behaviours from objects - thus throwing away the very stuff of OOP.</p>

<p>At some point, it feels like the OOP crowd are doing this due to some weird masochistic urge to self-flagellate. Anything, rather than just accepting that all the tomes of design pattern dogma they have internalized over the years are inferior to the paradigms espoused by functional programming.</p>

<p>Let’s stop, please?</p>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[If you work in a shop where they talk of object-oriented programming in glowing terms, you probably got handed a copy of the ‘Design Patterns’ book by Erich Gamma et al., the so-called “Gang of Four” or the ‘Head First Design Patterns’ book by Eric Freeman et al. Your colleagues probably pride themselves on how they can “think and communicate with each other” in all these repeatedly used abstractions called “design patterns”. Yeah, I know. I was one of them.]]></summary></entry><entry><title type="html">My New Favourite Terminal Emulator: Vim</title><link href="https://balajeerc.info/My-New-Favourite-Terminal-Emulator-Vim/" rel="alternate" type="text/html" title="My New Favourite Terminal Emulator: Vim" /><published>2017-09-20T00:00:00+00:00</published><updated>2017-09-20T00:00:00+00:00</updated><id>https://balajeerc.info/My-New-Favourite-Terminal-Emulator:--Vim</id><content type="html" xml:base="https://balajeerc.info/My-New-Favourite-Terminal-Emulator-Vim/"><![CDATA[<p>I hate having to use the mouse. I do everything I can to change my workflow so that I never have to touch it. That includes using a <a href="/Tiling-Window-Managers">tiling window manager</a>, <a href="https://addons.mozilla.org/en-US/firefox/addon/vimfx">using Firefox with the VimFx extension</a> so that I can use my browser just the way I use vim, and using command-line apps rather than GUI apps for pretty much anything (vim, alpine, musikcube, etc.)</p>

<p>However, the one thing that really forced me to use the mouse quite often was my terminal emulator. In most terminal emulators, if you want to scroll up to see earlier output, you need to employ some uncomfortable keystrokes (Win + Shift + up/down in gnome-terminal). The most annoying scenario is when you want to select text from a terminal to copy and paste it somewhere else. The <a href="https://askubuntu.com/questions/302263/selecting-text-in-the-terminal-without-using-the-mouse">hoops you need to jump through in gnome-terminal to do this natively are ridiculous</a>. As of this writing, the only sane way to do this WITHOUT using the mouse is to always run your session via <code class="language-plaintext highlighter-rouge">screen</code>. <code class="language-plaintext highlighter-rouge">screen</code> is a great piece of software, but I really don’t want to muscle-memorize a whole slew of different shortcuts to change modes and select text when, as a vim user, I have already put in the effort of ingraining a functional workflow.</p>

<p>So, that began my next quest: to find a terminal emulator that lets me use the vim workflow. I went through a bunch of terminal emulators trying to get them to fit the workflow I wanted, and I hit dead ends repeatedly. I also tried using zsh’s vi mode. I am probably going to use zsh as my shell, but its vi mode does not let me customize the shortcuts the way I like.</p>

<p>For example, I use Ctrl+h and Ctrl+l to jump to the start and end of a line because I don’t like the vim defaults. Another indispensable customization is using a quick double press of the ‘j’ key to get out of INSERT mode. I just can’t live without them.</p>

<p>Also, frankly I think the job of managing shortcuts and workflow ought to be the responsibility of my terminal emulator, not my shell.</p>

<p>As is often the case in life, the perfect solution was always at hand, hiding in plain sight.</p>

<h2 id="if-you-want-vim-just-use-vim">If you want Vim, just use Vim!</h2>

<p><a href="https://neovim.io/">Neovim</a> is a really great vim fork I have been using for a while now. Turns out that Neovim <a href="https://neovim.io/doc/user/nvim_terminal_emulator.html">can open a terminal emulator buffer</a>. I had even been happily using this feature for the past couple of months while developing - code in one buffer and the shell in a split buffer. All that time, I was looking for a terminal emulator that I could use to replicate this convenient shell environment I had within nvim.</p>

<p>I feel like kicking myself for not seeing the obvious: why don’t I just use nvim with a single terminal buffer as my primary terminal emulator?</p>

<p>There were a few simple customizations I needed to do:</p>

<ul>
  <li>make nvim start in insert mode (done with the <code class="language-plaintext highlighter-rouge">+startinsert</code> commandline option)</li>
  <li>make nvim start with a different, stripped-down configuration file, so that you don’t incur the startup slowdown of loading heavy vim plugins each time you open a terminal (done with the <code class="language-plaintext highlighter-rouge">-u</code> option)</li>
  <li>hide the vim status line at the bottom, just for cosmetic purposes, so that the result looks like a typical terminal window rather than another vim instance (achieved thanks to a helpful <a href="https://unix.stackexchange.com/questions/140898/vim-hide-status-line-in-the-bottom">Stack Exchange snippet</a>)</li>
</ul>

<p>I just had to make the following shell snippet my primary terminal emulator:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nvim +startinsert <span class="nt">-u</span> <span class="nv">$HOME</span>/.i3/terminal.init.vim term://bash
</code></pre></div></div>

<p>where my <code class="language-plaintext highlighter-rouge">terminal.init.vim</code> file looks like:</p>

<div class="language-vim highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">set</span> <span class="nb">guifont</span><span class="p">=</span>Monospace\ <span class="m">12</span>

<span class="k">set</span> <span class="nb">t_Co</span><span class="p">=</span><span class="m">256</span>
<span class="k">set</span> <span class="nb">background</span><span class="p">=</span><span class="nb">dark</span>
<span class="nb">highlight</span> Normal ctermbg<span class="p">=</span><span class="nb">NONE</span>
<span class="nb">highlight</span> nonText ctermbg<span class="p">=</span><span class="nb">NONE</span>

<span class="c">" Make vim use X clipboard by default while doing yanks and pastes</span>
<span class="k">set</span> <span class="nb">clipboard</span><span class="p">=</span>unnamedplus

<span class="c">" Prevent vim from clearing out the clipboard on exit</span>
autocmd <span class="nb">VimLeave</span> * <span class="k">call</span> <span class="nb">system</span><span class="p">(</span><span class="s2">"xsel -ib"</span><span class="p">,</span> <span class="nb">getreg</span><span class="p">(</span><span class="s1">'+'</span><span class="p">))</span>

<span class="c">" Prevent vim from forcing you to save a changed buffer or using ! when</span>
<span class="c">" switching between buffers</span>
<span class="k">set</span> <span class="nb">hidden</span>

<span class="c">" Specify a directory for plugins (for Neovim: ~/.local/share/nvim/plugged)</span>
<span class="k">call</span> plug#begin<span class="p">(</span><span class="s1">'~/.vim/plugged'</span><span class="p">)</span>

<span class="c">" Make sure you use single quotes</span>
Plug <span class="s1">'junegunn/vim-easy-align'</span>
Plug <span class="s1">'junegunn/fzf'</span><span class="p">,</span> <span class="p">{</span> <span class="s1">'dir'</span><span class="p">:</span> <span class="s1">'~/.fzf'</span><span class="p">,</span> <span class="s1">'do'</span><span class="p">:</span> <span class="s1">'./install --all'</span> <span class="p">}</span>
Plug <span class="s1">'flazz/vim-colorschemes'</span>
Plug <span class="s1">'xolox/vim-misc'</span>
Plug <span class="s1">'qpkorr/vim-bufkill'</span>

<span class="c">" Add plugins to &amp;runtimepath</span>
<span class="k">call</span> plug#end<span class="p">()</span>

<span class="k">colorscheme</span> molokai

<span class="k">set</span> <span class="nb">relativenumber</span>
<span class="k">set</span> <span class="k">number</span>

nnoremap <span class="p">&lt;</span>C<span class="p">-</span>S<span class="p">-</span><span class="k">tab</span><span class="p">&gt;</span> <span class="p">:</span><span class="k">bprevious</span><span class="p">&lt;</span>CR<span class="p">&gt;</span>
nnoremap <span class="p">&lt;</span>C<span class="p">-</span><span class="k">tab</span><span class="p">&gt;</span>   <span class="p">:</span><span class="k">bnext</span><span class="p">&lt;</span>CR<span class="p">&gt;</span>

<span class="k">filetype</span> plugin <span class="nb">indent</span> <span class="k">on</span>
<span class="nb">syntax</span> enable

<span class="c">" Neovim's Python provider</span>
<span class="c">"let g:python_host_prog  = '/usr/local/bin/python3'</span>
<span class="c">"let g:python3_host_prog = '/usr/local/bin/python3'</span>

<span class="c">" Move up and down in autocomplete with &lt;c-j&gt; and &lt;c-k&gt;</span>
inoremap <span class="p">&lt;</span>expr<span class="p">&gt;</span> <span class="p">&lt;</span><span class="k">c</span><span class="p">-</span><span class="k">j</span><span class="p">&gt;</span> <span class="p">(</span><span class="s2">"\&lt;C-n&gt;"</span><span class="p">)</span>
inoremap <span class="p">&lt;</span>expr<span class="p">&gt;</span> <span class="p">&lt;</span><span class="k">c</span><span class="p">-</span><span class="k">k</span><span class="p">&gt;</span> <span class="p">(</span><span class="s2">"\&lt;C-p&gt;"</span><span class="p">)</span>

<span class="c">" Map jj to leave insert mode </span>
inoremap jj <span class="p">&lt;</span>esc<span class="p">&gt;</span>

<span class="c">" Easier jumping to first non-space and </span>
<span class="c">" last non-space letters in line</span>
inoremap <span class="p">&lt;</span>C<span class="p">-</span><span class="k">h</span><span class="p">&gt;</span> <span class="p">&lt;</span>C<span class="p">-</span><span class="k">o</span><span class="p">&gt;</span>^
inoremap <span class="p">&lt;</span>C<span class="p">-</span><span class="k">l</span><span class="p">&gt;</span> <span class="p">&lt;</span>C<span class="p">-</span><span class="k">o</span><span class="p">&gt;</span>g_
vmap <span class="p">&lt;</span>C<span class="p">-</span><span class="k">h</span><span class="p">&gt;</span> <span class="p">&lt;</span>Home<span class="p">&gt;</span>
vmap <span class="p">&lt;</span>C<span class="p">-</span><span class="k">l</span><span class="p">&gt;</span> <span class="p">&lt;</span>End<span class="p">&gt;</span>

<span class="c">" Easier jumping to home and end</span>
nnoremap <span class="p">&lt;</span>C<span class="p">-</span><span class="k">h</span><span class="p">&gt;</span> <span class="p">&lt;</span>Home<span class="p">&gt;</span>
nnoremap <span class="p">&lt;</span>C<span class="p">-</span><span class="k">l</span><span class="p">&gt;</span> <span class="p">&lt;</span>End<span class="p">&gt;</span>

<span class="c">" Disable arrow keys in Normal, Visual and Operator-pending modes</span>
<span class="nb">map</span> <span class="p">&lt;</span><span class="k">up</span><span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>
<span class="nb">map</span> <span class="p">&lt;</span>down<span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>
<span class="nb">map</span> <span class="p">&lt;</span><span class="k">left</span><span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>
<span class="nb">map</span> <span class="p">&lt;</span><span class="k">right</span><span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>

<span class="c">" Disable Arrow keys in Insert mode</span>
imap <span class="p">&lt;</span><span class="k">up</span><span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>
imap <span class="p">&lt;</span>down<span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>
imap <span class="p">&lt;</span><span class="k">left</span><span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>

<span class="k">set</span> <span class="nb">directory</span><span class="p">=</span>$HOME<span class="sr">/.vim/</span>swapfiles<span class="sr">//</span>
<span class="k">set</span> <span class="nb">backupdir</span><span class="p">=</span>$HOME<span class="sr">/.vim/</span>backups<span class="sr">//</span>
imap <span class="p">&lt;</span><span class="k">right</span><span class="p">&gt;</span> <span class="p">&lt;</span>nop<span class="p">&gt;</span>

<span class="c">" Remap terminal mode toggle</span>
<span class="k">tnoremap</span> jj <span class="p">&lt;</span>C<span class="p">-</span>\<span class="p">&gt;&lt;</span>C<span class="p">-</span><span class="k">n</span><span class="p">&gt;</span>

<span class="c">" Set splitting modes</span>
<span class="k">set</span> <span class="nb">splitbelow</span>
<span class="k">set</span> <span class="nb">splitright</span>

<span class="c">" Keep mouse disabled (mouse=c enables it only in command-line mode)</span>
<span class="k">set</span> <span class="nb">mouse</span><span class="p">=</span><span class="k">c</span>

<span class="c">" Remap bufkill from BD to cntrl-x</span>
<span class="nb">map</span> <span class="p">&lt;</span>C<span class="p">-</span><span class="k">x</span><span class="p">&gt;</span> <span class="p">:</span>BD<span class="p">&lt;</span><span class="k">cr</span><span class="p">&gt;</span>

<span class="c">" Code to hide the status line just for cosmetic purposes</span>
<span class="k">let</span> <span class="nv">s:hidden_all</span> <span class="p">=</span> <span class="m">0</span>
<span class="k">function</span><span class="p">!</span> ToggleHiddenAll<span class="p">()</span>
    <span class="k">if</span> <span class="nv">s:hidden_all</span>  <span class="p">==</span> <span class="m">0</span>
        <span class="k">let</span> <span class="nv">s:hidden_all</span> <span class="p">=</span> <span class="m">1</span>
        <span class="k">set</span> <span class="nb">noshowmode</span>
        <span class="k">set</span> <span class="nb">noruler</span>
        <span class="k">set</span> <span class="nb">laststatus</span><span class="p">=</span><span class="m">0</span>
        <span class="k">set</span> <span class="nb">noshowcmd</span>
    <span class="k">else</span>
        <span class="k">let</span> <span class="nv">s:hidden_all</span> <span class="p">=</span> <span class="m">0</span>
        <span class="k">set</span> <span class="nb">showmode</span>
        <span class="k">set</span> <span class="nb">ruler</span>
        <span class="k">set</span> <span class="nb">laststatus</span><span class="p">=</span><span class="m">2</span>
        <span class="k">set</span> <span class="nb">showcmd</span>
    <span class="k">endif</span>
<span class="k">endfunction</span>

<span class="k">call</span> ToggleHiddenAll<span class="p">()</span>
</code></pre></div></div>

<p>So what does the end result look like?</p>

<p><img src="/images/posts/nvim_as_terminal_emulator.png" alt="nvim as terminal emulator" /></p>

<p>Yes, I know how ironic it is to take a screenshot of an i3wm workspace with floating windows in it, but I needed the eye candy for this post. :)</p>]]></content><author><name>Balajee RamaChandran</name></author><summary type="html"><![CDATA[I hate having to use the mouse. I do everything I can to change my workflow so that I never have to touch it. That includes using a tiling window manager, using Firefox with the VimFx extension so that I can use my browser just the way I use vim, and using command-line apps rather than GUI apps for pretty much anything (vim, alpine, musikcube, etc.)]]></summary></entry></feed>