SAFURAI-001: A New Qualitative Approach for Code LLM Evaluation (with arXiv Link)

Leonardo Boiardi • 2023-09-21

Check out our first publication on arXiv! We present two innovations from Safurai: the Safurai-001 model and the GPT4-based MultiParameters evaluation.

“A samurai wearing a black jacket working as a chemist, digital 2d art” - Made with Midjourney

At Safurai, we've always believed in pushing the limits, challenging the norms, and transforming the future of AI technology. Continuing this tradition, we are thrilled to introduce our research on Large Language Models (LLMs) and coding assistance.

LINK TO THE PAPER: https://arxiv.org/abs/2309.11385

First Innovation: SAFURAI-001

Our team's dedication, in-depth study, and relentless pursuit of excellence have resulted in a new model – Safurai-001, a more “helpful” AI Coding LLM.

The research and development of our Large Language Model are deeply rooted in the most recent advances in the field. The idea behind the creation of Safurai-001 was to come up with a model that performs similarly to the latest models like WizardCoder, PanguCoder, and Phi-1. However, our core differentiator is that Safurai-001 challenges the status quo by adding a layer of unique conversational interaction. Safurai-001 goes beyond being just conversational and truly aims to be more helpful in assisting with coding tasks.

A key part of our development process was instruction tuning and prompt engineering. Unlike WizardCoder, PanguCoder, and Phi-1, we did not rely on highly curated data chosen to perform well on benchmarks. By taking a different approach, we aim to keep our model from being biased towards, or shaped by, specific benchmarks. This allows Safurai-001 to offer a fresh and innovative perspective in the field.

Safurai-001's training is based on recent prompt engineering techniques for data transformation, such as Chain of Thought (CoT) and Tree of Thoughts (ToT). As a result, developers can expect a coding assistant that not only understands their instructions but also helps them more effectively.

Second Innovation: GPT4-based MultiParameters Evaluation

In our journey to innovation, we realized that there was a critical void – the lack of a substantial metric to evaluate the performance of coding LLMs effectively. This realization inspired us to develop the GPT4-based MultiParameters, an evaluation benchmark that effectively gauges different parameters to provide a complete insight into the model's functioning and performance.
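To make the idea of a multi-parameter evaluation more concrete, here is a minimal sketch of how per-parameter scores from a judge model could be averaged over a set of answers. It is an illustration only, not the procedure from our paper: the parameter list, the `evaluate_model` function, and the `toy_judge` stand-in (which replaces a real GPT-4 call with a fixed rule so the example runs offline) are all assumptions made for this example.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical parameter names, chosen for illustration only.
# This post names just one parameter, "Code Readability".
PARAMETERS = ["code_readability", "code_correctness"]

# A judge maps (question, answer, parameter) to a numeric score.
Judge = Callable[[str, str, str], float]


def evaluate_model(samples: List[Dict[str, str]], judge: Judge) -> Dict[str, float]:
    """Average the judge's score for each parameter over all samples."""
    return {
        param: mean(judge(s["question"], s["answer"], param) for s in samples)
        for param in PARAMETERS
    }


def toy_judge(question: str, answer: str, parameter: str) -> float:
    """Stand-in for a GPT-4 call: a fixed rule so the sketch runs offline."""
    if parameter == "code_readability":
        # Purely illustrative rule: reward answers that contain a comment.
        return 1.0 if "#" in answer else 0.5
    # Purely illustrative rule: reward answers that return a value.
    return 1.0 if "return" in answer else 0.0


samples = [
    {"question": "Add two numbers", "answer": "def add(a, b):\n    # sum\n    return a + b"},
    {"question": "Negate a number", "answer": "def neg(a):\n    return -a"},
]

scores = evaluate_model(samples, toy_judge)
print(scores)
```

In a real setup, `toy_judge` would be replaced by a call that asks GPT-4 to rate each answer on the given parameter. Keeping the judge behind a plain function signature makes it easy to swap in.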

Our assessments, based on these metrics, paint a promising picture for Safurai-001. According to this benchmark, Safurai-001 outperforms GPT-3.5 by 1.58% and leads WizardCoder by a whopping 18.78% in the Code Readability parameter.

While these numbers are incredibly promising, our journey with Safurai-001 is far from over. We believe that every step in this journey opens up a new dimension of possibilities and opportunities to enhance and refine our LLM. Our core focus remains to continuously improve Safurai-001 and make it the most reliable, powerful tool for coding assistance.


At Safurai, we are pioneers charting the unexplored territories of AI and coding assistance. With Safurai-001, we're forging paths to tomorrow, creating meaningful technology that empowers and brings value to our users. Stay tuned for more exciting updates from our continuous journey of learning, inventing, and innovating. 🥷
