Recently I stumbled upon a simple yet powerful tool from Lido: a CLI tool called Diffyscan. What is it for?
It lets you compare the source code of verified smart contracts on Etherscan against a GitHub repository.
Is it a must-have for security researchers?
Let me show you how we used it in our latest security spotcheck of Lido V2.
The problem
We were kicking off a security spotcheck of the deployed version of Lido V2 on Ethereum mainnet.
On the first day, I needed to define the scope by collecting all relevant contracts and their addresses.
Luckily, Lido publishes all their deployed contracts with addresses in a single place. This was a great starting point.
I crawled all addresses, then opened each one in Etherscan. Soon I realized that the "Code" view in Etherscan wouldn't be a sensible choice. I wanted to read huge contracts, take down notes, etc. I was missing an IDE.
How could I move the Etherscan-verified source code of 10+ addresses to my local testing environment?
I could've used this handy trick to move from Etherscan to an in-browser instance of VSCode thanks to deth.net. But that wouldn't have been enough.
I was dealing with lots of addresses with large contracts, libraries, external dependencies, etc. Needless to say, manually copying them to local files was not a viable option. I still had one alternative though.
Use the source code in Lido's GitHub repository!
However, could I find a public version in the repository that matched the one deployed?
If I could, then it'd be just a matter of cloning the repo and checking out the commit that matched the deployed version. But...
Well, it wasn't easy. At first I wasn't able to find the exact commit documented anywhere. I also didn't know whether the tagged versions of the repo corresponded to deployments.
As I searched, I ran into some promising branches that looked like they could match. Which one should I choose?
It would've been a looong guessing game, had I not found Diffyscan.
Diffyscan
It advertised exactly what I needed. And it was developed by Lido devs.
How could I not give it a try?
I cloned Diffyscan's repository, and followed the setup instructions to install dependencies in my local environment.
Then I took the existing config.sample.json file and, using the addresses of the deployed contracts, created my own config.json.
The chosen branch (v2-upgrade-cleanup) was just a first guess. Spoiler alert: it ended up being ok. We also tried it with v2.0.0-rc.2, and it was ok too.
As for the dependencies, I left the default ones. Since Diffyscan comes from Lido devs, it seems they listed the exact dependencies they need for their contracts.
The next step was getting an Etherscan API token, as well as one for GitHub, and then exporting them as the environment variables ETHERSCAN_TOKEN and GITHUB_API_TOKEN.
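It's easy to forget one of the two tokens and only find out mid-run. Here's a tiny pre-flight check you could run first. It's my own helper, not part of Diffyscan; the only things taken from the tool are the two variable names above, and the function name is made up.

```python
import os
from typing import List, Mapping

# The two variable names mentioned above. Everything else is illustrative.
REQUIRED_TOKENS = ("ETHERSCAN_TOKEN", "GITHUB_API_TOKEN")


def missing_tokens(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_TOKENS if not env.get(name)]
```

Calling `missing_tokens()` with no arguments inspects the real environment. An empty list means both tokens are exported.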
Finally, I ran python3 main.py in the root dir of Diffyscan's repository.
The result
Upon running, Diffyscan loaded the config, and started crawling the contracts.
For each address I set in the config it retrieved all files from Etherscan, tried to locate them in the repository, and diff'ed them.
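To give an idea of what "retrieving all files" involves, here's a rough sketch of splitting one verified contract into its files. It is not Diffyscan's code. It assumes the `SourceCode` field returned by Etherscan's `getsourcecode` API action comes in one of three shapes that I've seen: a single flattened file, a JSON object keyed by file path, or a Solidity standard-JSON input wrapped in an extra pair of braces. The function name and the fallback file name are made up.

```python
import json
from typing import Dict


def split_verified_sources(source_code: str, fallback_name: str = "Contract.sol") -> Dict[str, str]:
    """Map file path -> file content for one `SourceCode` string."""
    text = source_code.strip()

    # Assumed shape 1: standard-JSON input wrapped in double braces, {{ ... }}.
    if text.startswith("{{") and text.endswith("}}"):
        payload = json.loads(text[1:-1])
        return {path: entry["content"] for path, entry in payload["sources"].items()}

    # Assumed shape 2: a plain JSON object, keyed by path or holding a "sources" key.
    if text.startswith("{"):
        try:
            payload = json.loads(text)
        except json.JSONDecodeError:
            payload = None
        if isinstance(payload, dict):
            sources = payload.get("sources", payload)
            return {path: entry["content"] for path, entry in sources.items()}

    # Assumed shape 3: a single flattened Solidity file.
    return {fallback_name: source_code}
```

Each path in the resulting dict can then be looked up in the repository and compared.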
In this case, it started with the contract at 0x17144556fd3424EDC8Fc8A4C940B2D04936d17eb, and retrieved 37 files to compare against the repository.
Once done, it logged, for each file, whether it was found in the repository, how many differences it spotted, and a local path to a pretty HTML digest with the actual diff.
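If you're curious how little it takes to get a difference count and a side-by-side HTML view, Python's standard `difflib` already covers both. The snippet below is only a sketch of the idea. I haven't checked how Diffyscan implements it internally, and the function names are mine.

```python
import difflib


def count_changed_lines(deployed: str, repo: str) -> int:
    """Count lines that appear on only one side of the comparison."""
    diff = difflib.ndiff(deployed.splitlines(), repo.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def html_report(deployed: str, repo: str) -> str:
    """Build a standalone side-by-side HTML page for the two sources."""
    return difflib.HtmlDiff().make_file(
        deployed.splitlines(),
        repo.splitlines(),
        fromdesc="Etherscan",
        todesc="GitHub",
    )
```

Writing the string returned by `html_report` to a file gives you something you can open in a browser, much like the digest described here.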
As you can see below, it also works on the dependencies!
Then it compared the Lido-specific files.
Afterwards, it continued the process with the next address I had specified in the config.
And on and on went Diffyscan. Until it found a difference!
Diffyscan placed the HTML report in a local folder it had created, called digest. It included subfolders for each run and address.
The HTML showed me the two files side-by-side, highlighting the exact place of the difference:
Well, false alarm! 😅
It seems the file just had an extra whitespace character. For larger diffs I'd have seen bigger green and red portions of the files highlighted.
Once the whole run was done, I could see similar reports for all scanned addresses:
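When a run reports lots of one-line differences, it helps to triage the harmless ones quickly. Below is a small helper of my own (not a Diffyscan feature) that flags pairs of files differing only in whitespace layout. It is deliberately conservative: it collapses runs of whitespace instead of removing them, so two tokens glued together still count as a real change.

```python
def differs_only_in_whitespace(deployed: str, repo: str) -> bool:
    """True if the texts differ, but match once whitespace runs are collapsed."""
    if deployed == repo:
        return False  # identical files are not a "difference" at all

    def collapse(text: str) -> str:
        # split() drops leading/trailing whitespace and splits on any run of it
        return " ".join(text.split())

    return collapse(deployed) == collapse(repo)
```

Anything this returns `False` for, other than identical files, still deserves a look in the HTML report.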
There was also a logs.txt file including all logs, which could have been useful had I needed to move away from the terminal.
My impressions
Diffyscan is a great starting point. It's simple, fast, and most of all, gets the job done. It does require some luck though.
If you don't know which version of the GitHub repository to use, and can't at least narrow down your options, it won't help much. You'd have to iterate over every possible version of the code, and compare each one to the verified source code.
On top of that, the config requires you to list dependencies. Same problem as before. If you don't know the dependencies used, or their exact versions, you're in trouble.
I think Diffyscan works great in at least two scenarios.
First, if you know both the addresses and the exact version in GitHub, it helps you verify that the code matches. This is useful to verify claims of developers, and quickly spot potential differences, if any.
And second, if you have narrowed down the versions to just a few, but are still unsure about which one matches, you can run the tool a few times and find the right one in a couple of minutes.
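That second scenario is really a small search: score each candidate ref by how many deployed files it fails to reproduce, then pick the lowest score. Here's a sketch of that loop. It doesn't call Diffyscan at all; `fetch` is a hypothetical callable you'd have to supply (for example, one that reads a file from a local checkout of a given ref), and the function names are made up.

```python
from typing import Callable, Dict, List, Optional, Tuple

# fetch(ref, path) returns the file content at that ref, or None if it's absent.
Fetch = Callable[[str, str], Optional[str]]


def count_mismatches(verified: Dict[str, str], fetch: Fetch, ref: str) -> int:
    """Number of deployed files that are missing or different at `ref`."""
    return sum(1 for path, source in verified.items() if fetch(ref, path) != source)


def best_matching_ref(
    verified: Dict[str, str], fetch: Fetch, candidates: List[str]
) -> Tuple[str, Dict[str, int]]:
    """Return the candidate with the fewest mismatches, plus all scores."""
    scores = {ref: count_mismatches(verified, fetch, ref) for ref in candidates}
    best = min(candidates, key=lambda ref: scores[ref])
    return best, scores
```

A score of zero means every deployed file was found unchanged at that ref. Ties go to the earliest candidate in the list, so put your best guess first.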
Have you used Diffyscan? What are your impressions? Otherwise, if you have an alternative to diff smart contracts against mainnet, mind sharing?