Blockchain security firm OpenZeppelin recently tested whether artificial intelligence (AI) tools could replace human auditors in smart contract auditing. Using OpenAI’s ChatGPT-4, the firm’s Mariko Wakabayashi and Felix Wegener pitted the model against its Ethernaut security challenge. Ethernaut is a wargame consisting of 28 smart contracts, or levels, each to be hacked in the Ethereum Virtual Machine; to win, players must find the correct exploit for each level.
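To make the challenge format concrete, the sketch below simulates, in plain Python rather than Solidity, the kind of access-control flaw an Ethernaut-style level might contain. This is a hypothetical illustration of the exploit-hunting task, not an actual Ethernaut level; the `VulnerableVault` class and its methods are invented for this example.

```python
class VulnerableVault:
    """A toy 'contract' whose critical method is missing an ownership check."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.balance = 100  # toy stand-in for the contract's ether balance

    def change_owner(self, caller: str, new_owner: str) -> None:
        # BUG: there is no check that caller == self.owner,
        # so any caller can take ownership of the vault.
        self.owner = new_owner

    def withdraw_all(self, caller: str) -> int:
        # Only the current owner may drain the vault.
        if caller != self.owner:
            raise PermissionError("caller is not the owner")
        drained, self.balance = self.balance, 0
        return drained


# The "exploit": an attacker seizes ownership, then drains the funds.
vault = VulnerableVault(owner="alice")
vault.change_owner(caller="attacker", new_owner="attacker")
loot = vault.withdraw_all(caller="attacker")
print(vault.owner, loot)  # attacker 100
```

Spotting that the missing ownership check lets anyone drain the vault is the sort of reasoning each Ethernaut level demands, which is why the game doubles as a benchmark for auditing skill.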
ChatGPT-4 found the correct exploit for 20 of the 28 levels. However, it struggled with newer levels introduced after its September 2021 training data cutoff, and the plugin enabling web connectivity was not included in the test. Because OpenZeppelin expects its auditors to complete all Ethernaut levels, ChatGPT-4 is currently incapable of replacing human auditors.
Nevertheless, according to Wakabayashi and Wegener, AI can still be used to improve the efficiency of smart contract auditors and to detect security vulnerabilities. They noted that an AI model trained on tailored data with specific output goals could potentially deliver more reliable results than chatbots trained on large amounts of general data.
While AI tools may increase the efficiency of human auditors, Wegener explained that the total demand for smart contract audits outweighs the capacity to provide high-quality audits. This is expected to continue, leading to a growing number of people employed as auditors in the Web3 space.
Large language models (LLMs) such as ChatGPT-4 may not be ready for smart contract security auditing, a task that demands a high level of precision. LLMs are currently optimized for generating text and holding human-like conversations rather than delivering that precision. Future model iterations with better training and data could yield more reliable results.