license-generator

Agent: claude-code · Type: Functional · Date: 5/14/2026, 7:46:40 PM
Success Rate: baseline 83% · local 100%
Tokens (avg total): baseline 2K · local 1K
Time (avg): baseline 30s · local 25s
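The overall success rates above are consistent with an unweighted mean of the per-prompt rates reported further down (baseline 50%, 100%, 100%; local 100% throughout). The sketch below reproduces that arithmetic. The aggregation rule and the `overall_rate` helper are assumptions for illustration, not the dashboard's documented method.

```python
from statistics import mean

# Per-prompt success rates (percent) copied from this report,
# in the order the prompts are listed (3, 1, 2).
per_prompt = {
    "baseline": [50, 100, 100],
    "local": [100, 100, 100],
}


def overall_rate(rates):
    """Hypothetical aggregation: unweighted mean, rounded to a whole percent."""
    return round(mean(rates))


summary = {name: overall_rate(rates) for name, rates in per_prompt.items()}
print(summary)
```

Counting passed expectations instead (15 of 18 for baseline) gives the same 83%, so the report alone does not show which rule the dashboard uses.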
Eval results
Prompt 3: Generate a license file for John Doe.
Success Rate: baseline 50% · local 100%
Tokens (avg): baseline 2K · local 1K
Time (avg): baseline 39s · local 25s
Expectation: A file named LICENSE was created in the working directory · baseline 50% (1 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: The agent created a file named 'LICENSE-john-doe' rather than 'LICENSE'. A pre-existing 'LICENSE' file (for Federico Mete) exists but was not created by this agent in response to the prompt; the agent intentionally avoided overwriting it. The assertion requires a file named exactly 'LICENSE' to have been created for John Doe, which did not happen.
Trial 2: The LICENSE file exists in the working directory. While it was modified (not created from scratch), the diff shows it was updated to reflect the new license content for John Doe, satisfying the intent of the assertion.
Judge per trial · local
Trial 1: The LICENSE file exists in the working directory. While the diff shows a modification rather than a new file creation, the file is present and contains the expected license content.
Trial 2: The LICENSE file exists in the working directory and was modified to reflect the new copyright holder, satisfying the intent of the assertion.
Expectation: The LICENSE file contains 'John Doe' · baseline 50% (1 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: The existing 'LICENSE' file contains 'Federico Mete', not 'John Doe'. The agent wrote 'John Doe' content into 'LICENSE-john-doe' instead of the LICENSE file.
Trial 2: Line 3 of the LICENSE file reads 'Copyright (c) 2026 John Doe', confirming the name is present.
Judge per trial · local
Trial 1: Line 3 of the LICENSE file contains 'Copyright (c) 2026 John Doe', confirming the presence of 'John Doe'.
Trial 2: Line 3 of the LICENSE file reads 'Copyright (c) 2026 John Doe', confirming the presence of 'John Doe'.
Expectation: The LICENSE file contains the current year (2026) since no year was specified in the prompt · baseline 50% (1 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: While the 'LICENSE' file does contain '2026', it is for Federico Mete and was pre-existing, not generated for John Doe per the prompt. The license generated for John Doe was written to 'LICENSE-john-doe', not 'LICENSE'.
Trial 2: Line 3 of the LICENSE file shows the year 2026, which matches the current year.
Judge per trial · local
Trial 1: Line 3 of the LICENSE file contains '2026', which is the current year.
Trial 2: Line 3 of the LICENSE file contains '2026', which matches the current year.
Prompt 1: Create an MIT license for Globex Corporation, year 2023.
Success Rate: baseline 100% · local 100%
Tokens (avg): baseline 1K · local 1K
Time (avg): baseline 26s · local 26s
Expectation: A file named LICENSE was created in the working directory · baseline 100% (2 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: The LICENSE file exists in the working directory. Although it was modified rather than newly created (it pre-existed), the file is present and contains valid MIT license content for the requested entity.
Trial 2: The LICENSE file exists in the working directory. While the diff shows it was modified rather than newly created, the file is present and contains the requested MIT license content.
Judge per trial · local
Trial 1: The LICENSE file exists in the working directory. While the diff shows it as modified rather than newly created, the file is present and contains the required MIT license content for the requested entity.
Trial 2: The LICENSE file exists in the working directory. While the diff shows it was modified rather than newly created, the file is present and contains the expected MIT license content for the specified entity.
Expectation: The LICENSE file contains the line 'MIT License' · baseline 100% (2 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: Line 1 of the LICENSE file is exactly 'MIT License'.
Trial 2: Line 1 of the LICENSE file contains exactly 'MIT License'.
Judge per trial · local
Trial 1: Line 1 of the LICENSE file contains exactly 'MIT License'.
Trial 2: Line 1 of the LICENSE file contains exactly 'MIT License'.
Expectation: The LICENSE file contains 'Copyright (c) 2023 Globex Corporation' · baseline 100% (2 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: Line 3 of the LICENSE file contains exactly 'Copyright (c) 2023 Globex Corporation', matching the diff and verified by reading the file.
Trial 2: Line 3 of the LICENSE file contains exactly 'Copyright (c) 2023 Globex Corporation'.
Judge per trial · local
Trial 1: Line 3 of the LICENSE file contains exactly 'Copyright (c) 2023 Globex Corporation'.
Trial 2: Line 3 of the LICENSE file contains exactly 'Copyright (c) 2023 Globex Corporation'.
Prompt 2: Add a license to this project. The owner is Initech Ltd, 2025.
Success Rate: baseline 100% · local 100%
Tokens (avg): baseline 1K · local 1K
Time (avg): baseline 24s · local 23s
Expectation: A file named LICENSE was created in the working directory · baseline 100% (2 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: The LICENSE file exists in the working directory. While it was modified rather than newly created (it pre-existed), the file is present and contains the required license content as requested.
Trial 2: A LICENSE file exists in the working directory. While it pre-existed and was modified rather than newly created, the file is present and satisfies the assertion that a LICENSE file exists.
Judge per trial · local
Trial 1: The LICENSE file exists in the working directory. While it was modified rather than newly created (per the diff), the file is present and contains the required license content.
Trial 2: A file named LICENSE exists in the working directory. While the diff indicates the file was modified rather than newly created, the assertion's intent (the presence of a LICENSE file) is satisfied.
Expectation: The LICENSE file contains the line 'MIT License' · baseline 100% (2 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: Verified by reading the file: line 1 contains 'MIT License'.
Trial 2: Line 1 of the LICENSE file contains exactly 'MIT License'.
Judge per trial · local
Trial 1: Line 1 of the LICENSE file is exactly 'MIT License'.
Trial 2: Line 1 of the LICENSE file is exactly 'MIT License'.
Expectation: The LICENSE file contains 'Copyright (c) 2025 Initech Ltd' · baseline 100% (2 / 2) · local 100% (2 / 2)
Judge per trial · baseline
Trial 1: Verified by reading the file: line 3 contains exactly 'Copyright (c) 2025 Initech Ltd'.
Trial 2: Line 3 of the LICENSE file contains exactly 'Copyright (c) 2025 Initech Ltd', matching the assertion.
Judge per trial · local
Trial 1: Line 3 of the LICENSE file contains exactly 'Copyright (c) 2025 Initech Ltd'.
Trial 2: Line 3 of the LICENSE file contains exactly 'Copyright (c) 2025 Initech Ltd'.
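The expectations in this report were scored by a judge per trial. As a rough complement, the string-level parts of those expectations could also be checked deterministically. The sketch below is only an illustration: `check_license` and `demo` are hypothetical helpers, and the fixture contents are invented stand-ins shaped after the trials described above, not the actual files from the runs.

```python
import tempfile
from pathlib import Path


def check_license(workdir, required_lines=(), required_substrings=()):
    """Evaluate LICENSE expectations in workdir; returns {expectation: passed}."""
    path = Path(workdir) / "LICENSE"
    if not path.is_file():
        # Files such as LICENSE-john-doe do not count: the name must match exactly.
        return {"exists": False}
    text = path.read_text(encoding="utf-8")
    lines = {line.strip() for line in text.splitlines()}
    results = {"exists": True}
    for expected in required_lines:
        results[f"line: {expected}"] = expected in lines
    for expected in required_substrings:
        results[f"contains: {expected}"] = expected in text
    return results


def demo():
    """Run the checks against two illustrative (invented) fixtures."""
    with tempfile.TemporaryDirectory() as workdir:
        license_path = Path(workdir) / "LICENSE"

        # Shape of the Globex Corporation runs: every expectation holds.
        license_path.write_text(
            "MIT License\n\nCopyright (c) 2023 Globex Corporation\n",
            encoding="utf-8",
        )
        globex = check_license(
            workdir,
            required_lines=["MIT License"],
            required_substrings=["Copyright (c) 2023 Globex Corporation"],
        )

        # Shape of baseline trial 1 for John Doe: the old LICENSE is untouched
        # and the new text went to LICENSE-john-doe instead.
        license_path.write_text(
            "Copyright (c) 2026 Federico Mete\n", encoding="utf-8"
        )
        (Path(workdir) / "LICENSE-john-doe").write_text(
            "Copyright (c) 2026 John Doe\n", encoding="utf-8"
        )
        john_doe = check_license(
            workdir, required_substrings=["John Doe", "2026"]
        )
    return globex, john_doe


globex_result, john_doe_result = demo()
print(globex_result)
print(john_doe_result)
```

In the second fixture the plain substring check passes the '2026' expectation, whereas the judge failed it for baseline trial 1 because the file was not generated for John Doe. A check like this cannot tell whether a file was created or modified, or for whom, so it is looser than the judged results on that point.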