Analysis of Small Language Model
Applications for Data Insights
Project Plan
Executive Summary
The project focuses on enhancing the AGCUMEN platform, a specialised web-based
solution developed by SONEAN for the agriculture and technology sectors.
AGCUMEN currently spans more than 190 countries and includes over 60,000
organisations, delivering insights across the entire agriculture value chain, including
policymakers, business entities, investors, and other agriculture-related businesses. The
existing platform provides relevant information through complex parameter selection
in a matter of minutes, significantly reducing research time and costs for clients.
To enhance AGCUMEN’s capabilities, our team has been tasked with analysing and
selecting a suitable Small Language Model (SLM) integrated with Natural Language
Processing (NLP) techniques. The primary objective is to assess and recommend a user-
friendly system that can deliver instant answers in natural language, improving response
times from minutes to seconds. The project scope includes finding the right model for
the requirement, demonstrating the working method by using a prototype model, and
providing user training documents. The SLM will be accessible and easy to use,
so that even users with minimal IT knowledge can interact with the system. This project
excludes cloud support, third-party integration, and compatibility with other datasets or
languages.
The project involves managing internal, external, and technical risks with careful
consideration of scope alignment and communication plans with stakeholders. Roles
and responsibilities are shared among team members to ensure successful project
outcomes.
The final deliverables will include a comprehensive project report and a prototype in
the form of a Small Language Model to demonstrate how it works. The report will
include our requirement analysis, research findings, model selection, recommendations,
and training materials on how to set up the selected SLM and maintain it on an ongoing
basis. A project presentation will also be delivered. All deliverables are aimed at
providing a solution for the client’s requirements. The project will be evaluated based
on the quality of research, timeliness, and client satisfaction with the deliverables.
Project Background and Description
Overview of the Host Organisation
SONEAN is an "Ecosystem Intelligence" firm based near Frankfurt, Germany, which
provides global decision-makers with connected and dynamic insights into their
operational ecosystems. Founded in 2015, SONEAN initially offered customized
solutions to monitor global opportunities and threats. In 2022, the company launched
AGCUMEN, a web-based Intelligence as a Service platform tailored for the agriculture
sector. This service delivers structured, timely intelligence to various stakeholders in
agriculture, including corporates, policymakers, and investors, and includes customized
updates, special reports, and strategic support.
Context of the Organisation
SONEAN specializes in Ecosystem Intelligence, providing actionable and connected
insights through a network-based approach. Their innovative service enables clients to
monitor industry opportunities and threats dynamically, reducing research time and
costs by over 95%. With a blend of machine-based and human oversight, SONEAN
delivers fast, high-quality intelligence, helping clients to make better-informed
decisions and achieve significant cost savings.
Project Description
AGCUMEN, SONEAN’s specialised platform for agriculture and technology, provides
a comprehensive web-based solution that offers dynamic, connected intelligence across
the entire agricultural value chain. This platform currently allows users to navigate
through a highly structured dataset, using a dashboard interface to select parameters
and retrieve relevant information. To further enhance AGCUMEN's capabilities,
SONEAN is collaborating with the Industrial AI Research Centre at UniSA on a joint
student project. The focus of this project is to research a Small Language Model (SLM)
integrated with Natural Language Processing (NLP) techniques. The goal is to create a
solution that provides a user-friendly, interactive system that delivers instant, accurate
answers to user queries from the given dataset. This new approach aims to significantly
reduce response times by replacing the traditional dashboard parameter selection with
a question-based query system, allowing users to retrieve information quickly and
efficiently. The recommended model will provide users with timely insights, thereby
improving their overall experience with the AGCUMEN platform.
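The shift from parameter selection to a question-based query can be illustrated with a minimal sketch. It is not the AGCUMEN implementation: the records, field names, and functions below are all hypothetical, and a simple keyword lookup stands in for the SLM so that the example runs without any model.

```python
# Hypothetical toy records; the real AGCUMEN dataset and schema are not described in this plan.
RECORDS = [
    {"organisation": "Org A", "country": "Germany", "segment": "crop protection"},
    {"organisation": "Org B", "country": "Australia", "segment": "precision farming"},
    {"organisation": "Org C", "country": "Germany", "segment": "precision farming"},
]


def filter_records(records, **params):
    """Dashboard-style retrieval: the user picks each parameter explicitly."""
    return [r for r in records if all(r[k] == v for k, v in params.items())]


def question_to_params(question, records):
    """Stand-in for the SLM step (assumption): map a question to filter parameters.

    A real SLM would interpret free-form language; this keyword lookup only
    illustrates what goes into and comes out of that step.
    """
    text = question.lower()
    params = {}
    for record in records:
        for field in ("country", "segment"):
            if record[field].lower() in text:
                params[field] = record[field]
    return params


def answer(question, records=RECORDS):
    """Question-based retrieval: derive the parameters, then reuse the same filter."""
    params = question_to_params(question, records)
    return [r["organisation"] for r in filter_records(records, **params)]


print(answer("Which organisations work on precision farming in Germany?"))
```

In this sketch the user types one question instead of setting two dashboard parameters, and the retrieval logic underneath stays the same.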
Importance of the Project to the Organisation
The AGCUMEN platform provides comprehensive ecosystem intelligence, linking
over 60,000 core organisations across 190+ countries and analysing more than 60
million signals daily in over 50 languages. Currently, AGCUMEN delivers insights
in minutes, reducing research time by over 95% and saving clients
substantial real and opportunity costs.
The development of a Small Language Model (SLM) application aims to further
enhance this capability by providing answers in a matter of seconds. This improvement
will not only accelerate information retrieval but also strengthen AGCUMEN’s value
proposition, offering users an even faster, more efficient way to access critical insights.
By integrating SLM methods, SONEAN will solidify its position as a leader in
ecosystem intelligence, driving greater client satisfaction, operational efficiency, and
competitive advantage.
Objectives of the Project
The primary objective of this project is to analyse Small Language Models (SLMs),
provide a comprehensive study report on them, and recommend suitable models to meet
the customer’s needs.
The major objectives are to:
• Understand the client requirements: Through meetings and a business study,
understand the current business model and the requirements for meeting future
needs.
• Research on SLMs: Perform detailed research on available SLMs.
• Model Selection: Choose the model that supports the business need and aligns
with the project scope.
• Feasibility Study: Analyse the chosen model to determine whether it is
technically, financially, and operationally viable.
• Recommendation: Provide recommendations based on all findings.
• Proof of Concept: Demonstrate the model workflow using a prototype or
available examples.
• Training Material: Prepare a complete guide on how to configure
the model for real-time use.
Associated Benefits
Enhancing Information Retrieval: Reduce the time required to access critical insights
from minutes to mere seconds, thereby improving the overall efficiency and
effectiveness of the AGCUMEN platform.
Information Handling Potential: The integration of SLM technology will enable the
AGCUMEN platform to manage larger volumes of data and handle more complex
queries. It will also support expansion into additional industries and sectors.
Project Scope
Inclusion
The following aspects of the project are considered within scope:
Comprehensive Research: A detailed study of available SLMs, delivered as a
complete analysis report with research findings and recommendations.
Proof of Concept: The Proof of Concept will involve developing a prototype to
showcase the application's workflow, initially focusing on a single model. This
prototype will support English, use a clean dataset provided by the client for training
and testing, and be designed to run efficiently on a local machine. Further prototypes
will be considered only if the need arises.
Training Material: The training material will be prepared to help users understand
how to configure the model.
Timeline: The project is expected to take a total of eleven weeks to complete.
Exclusions
The following aspects of the project are explicitly out of scope:
User Interface: UI design and development will not be part of this project.
Compatibility with other languages: The model will not have the capability to
support any languages other than English.
Cloud Support: The model will be evaluated for compatibility with cloud platforms,
but no actions will be taken to demonstrate it at the cloud level.
Model performance on other datasets: The prototype will be trained and evaluated
using the given dataset. No assurance can be given regarding its performance with other
datasets, and data cleaning for new datasets will not be performed.
Change Management: No additional tasks will be carried out for planning the
migration strategies from the current web-based dynamic dashboards to the SLM-based
application.
Third Party Integration: Exploration of third-party support for the SLM is
not included in this project.
Risks
The risks associated with the project can be categorized into internal, external, and
technical risks.
Internal and External Risks: These may arise within the team or in dealings with the
client, including misalignment in understanding the project scope, potential changes in
scope, team coordination issues, client expectations, communication gaps, and changes
in client requirements.
Technical Risk: Limitations and challenges related to technology, such as
implementation and integration.
Understanding and addressing these risks is crucial. They are identified and
discussed below.
Risk: The Proof of Concept. The prototype may not function as expected with the given
dataset to demonstrate the workflow.
Probability (out of 1.0): 0.5
Impact: High
Mitigation Strategy: If the first prototype fails to function, alternative models will be
considered for developing a new prototype. If time constraints prevent developing a new
prototype or if all prototypes fail, the functionality will be explained using other
available resources as examples.

Risk: Technical limitation for completing the project. The team may be unable to
complete the task because training material is lacking or insufficient.
Probability (out of 1.0): 0.4
Impact: High
Mitigation Strategy: Model selection criteria will be used to finalise the model for
prototype development.

Risk: Team Coordination and Communication
Probability (out of 1.0): 0.3
Impact: Medium
Mitigation Strategy: Regular meetings will be scheduled for team engagement, and any
issues will be identified early and discussed with mentors.

Risk: Dataset Quality for Training and Evaluation
Probability (out of 1.0): 0.3
Impact: Medium
Mitigation Strategy: A cleaned and structured sample dataset will be used for training
and evaluation.

Risk: Misunderstanding of Project Scope
Probability (out of 1.0): 0.2
Impact: High
Mitigation Strategy: The project scope will be defined and documented. Each project
stage will be reviewed and validated with the mentor, supervisor, and client.

Risk: Data Privacy
Probability (out of 1.0): 0.1
Impact: High
Mitigation Strategy: The model will run on a local desktop. Relevant stakeholders
will ensure data privacy.
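One common way to prioritise the risks listed above is to multiply each probability by a numeric impact weight. The sketch below uses the probabilities and impact labels from the risk list. The numeric weights (High = 3, Medium = 2) are an assumption made for illustration and are not part of this plan.

```python
# Impact weights are an assumption for illustration; the plan only gives High/Medium labels.
IMPACT_WEIGHT = {"High": 3, "Medium": 2}

# (risk, probability out of 1.0, impact) as listed in the risk section
risks = [
    ("Proof of Concept prototype may not function", 0.5, "High"),
    ("Technical limitation for completing the project", 0.4, "High"),
    ("Team Coordination and Communication", 0.3, "Medium"),
    ("Dataset Quality for Training and Evaluation", 0.3, "Medium"),
    ("Misunderstanding of Project Scope", 0.2, "High"),
    ("Data Privacy", 0.1, "High"),
]


def exposure(probability, impact):
    """Risk exposure = probability x impact weight, rounded to avoid float noise."""
    return round(probability * IMPACT_WEIGHT[impact], 2)


# Highest exposure first; ties keep the order in which the risks were listed.
ranked = sorted(risks, key=lambda r: exposure(r[1], r[2]), reverse=True)

for name, probability, impact in ranked:
    print(f"{exposure(probability, impact):>4}  {name} ({probability}, {impact})")
```

Under these assumed weights, the prototype risk and the technical limitation risk rank highest, which matches the order in which the plan lists them.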
Budget
Budget Item: Team Member Time (Labor)
Estimated Cost: $59,200
Justification: The project staffing costs are calculated based on the hourly rates and
total hours worked for each role. Assuming minimum wages,
Project Manager = $60 * 320 hours = $19,200.
Requirement Analyst = $45 * 320 hours = $14,400.
Quality Analyst * 2 Staff = $40 * 320 hours = $12,800 * 2 = $25,600.

Budget Item: Software Licenses (AI/ML Tools)
Estimated Cost: $0
Justification: The project utilises free or academic-licensed software tools for
researching AI models and developing the prototype, eliminating the need for
additional expenses.

Budget Item: Hardware (Local Desktop Setup)
Estimated Cost: $0
Justification: The project leverages existing personal or university-provided hardware,
with no additional costs incurred for hardware acquisition or upgrades.

Budget Item: Data Acquisition / Subscription Fees
Estimated Cost: $0
Justification: We will use customer-provided datasets, avoiding costs related to data
acquisition or subscription fees.

Total Estimated Cost: $59,200
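The labour figures above can be checked with a short calculation. The rates, hours, and headcounts are taken from the budget. The dictionary layout and function name are only illustrative.

```python
# Staffing figures copied from the budget; the data layout is illustrative.
HOURS_PER_ROLE = 320

# role -> (hourly rate in dollars, number of staff)
roles = {
    "Project Manager": (60, 1),
    "Requirement Analyst": (45, 1),
    "Quality Analyst": (40, 2),
}


def labor_cost(rate, headcount, hours=HOURS_PER_ROLE):
    """Cost of one role: hourly rate x hours x number of staff."""
    return rate * hours * headcount


costs = {role: labor_cost(rate, staff) for role, (rate, staff) in roles.items()}
total_labor = sum(costs.values())

# Software, hardware, and data items are all budgeted at $0 in the plan.
other_items = {"Software Licenses": 0, "Hardware": 0, "Data Acquisition": 0}
total_cost = total_labor + sum(other_items.values())

for role, cost in costs.items():
    print(f"{role}: ${cost:,}")
print(f"Total estimated cost: ${total_cost:,}")
```

The per-role results are $19,200, $14,400, and $25,600, which sum to the stated total of $59,200.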
Communication Plan
The communications plan is crucial for ensuring smooth and effective collaboration
among the project team, client, mentor, and other stakeholders. The purpose of this plan
is to outline the agreed-upon methods, frequency, and objectives of communication
throughout the project period.
Communication with Client and Project Supervisor
Purpose: To update the client and Project Supervisor on project progress, gather
feedback, and ensure alignment with their expectations.
Frequency: Bi-weekly
Method:
• Formal Meetings (via Zoom/Teams)
• Email for updates, deliverables and questions.
Details:
• Bi-weekly Meetings: A progress review meeting will be held every two weeks to
discuss project status, challenges, and next steps.
• Emails: The project supervisor will send updates to the client, including
deliverables, progress reports, and any critical issues that require
feedback.
Communication with Project Mentor
Purpose: To seek guidance, receive feedback on research findings and on the technical
and strategic aspects of the project, and ensure alignment with best practices.
Frequency: Weekly check-ins
Method:
• One-on-one meeting.
• Teams and email for updates and documentation review
Details:
• Weekly Check-ins: Scheduled meetings to discuss project progress, technical
challenges, and obtain the mentor’s insights.
• Emails: Used for sending updates, seeking feedback on key decisions, and sharing
project documentation for review.
Communication with Team Members
Purpose: To coordinate tasks, ensure progress, discuss research findings and challenges,
and keep the team aligned on project objectives.
Frequency: Daily discussion and weekly detailed meetings
Method:
• Daily stand-ups (via WhatsApp/Teams)
• Weekly meetings (In person / Teams)
Details:
• Daily Discussion: Short meetings to discuss progress, share knowledge, and plan
the next step.
• Weekly Meetings: Detailed discussions on progress, task allocation, and ensuring
the alignment with the project scope.
Deliverables and Project Evaluation Criteria
The following will be delivered to the client at the end of the project:
Comprehensive Report: A detailed report covering research findings, model selection,
recommendations, and user training will be documented and delivered.
Project Presentation: Key findings, results, and recommendations will be explained
to stakeholders.
Proof of Concept: A prototype will be developed and demonstrated to showcase
application performance. Alternatively, an existing similar application may be
demonstrated to aid understanding.
Project Evaluation Criteria:
Research on SLMs: The research findings and model recommendations will
meet all the customer's requirements.
User Satisfaction: Feedback from the client on the quality of work and overall outcome.
Timeliness: Milestones and final deliverables must be completed and
delivered within the agreed deadlines.
Implementation Plan
The implementation plan for the project consists of several stages.
a) Project Initiation Phase: This phase involves a meeting with stakeholders (team
members, mentor, project supervisor, client) to discuss the business background and
needs and to define the project scope.
b) Planning Phase: In this phase, we will develop a project plan report, which will
explain the scope, limitations, risks associated with this project, roles and
responsibilities of the stakeholders, and our approach.
c) Execution Phase:
Researching SLMs: The project execution phase begins with research.
Based on the requirement analysis outcome, a detailed study will be conducted on
available SLMs.
Model selection: After detailed research on available models, the selection will be
based on criteria defined by client requirements and the future scope of this project.
Developing the Prototype: The chosen model will be developed into a prototype for
demonstration purposes.
d) Monitoring Phase:
Testing and Evaluation: The developed prototype will be tested and evaluated based
on its performance.
Documentation: Information about the model, including training and evaluation
results, as well as complete guidelines, will be documented.
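As an illustration of what testing and evaluating the prototype could involve, the sketch below measures exact-match accuracy and mean response time over a set of question and answer pairs. The choice of these two metrics, the `evaluate` function, and the toy model are assumptions for illustration. The evaluation criteria themselves remain to be agreed during the project.

```python
import time


def evaluate(model, test_cases):
    """Return exact-match accuracy and mean response time in seconds.

    `model` is any callable that takes a question string and returns an answer
    string (hypothetical interface); `test_cases` is a list of
    (question, expected_answer) pairs.
    """
    if not test_cases:
        raise ValueError("test_cases must not be empty")
    correct = 0
    elapsed = 0.0
    for question, expected in test_cases:
        start = time.perf_counter()
        answer = model(question)
        elapsed += time.perf_counter() - start
        # Case-insensitive exact match; a real evaluation may need a softer comparison.
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    n = len(test_cases)
    return {"accuracy": correct / n, "mean_seconds": elapsed / n}


def toy_model(question):
    """Placeholder for the prototype: answers one canned question only."""
    canned = {"What is 2 + 2?": "4"}
    return canned.get(question, "unknown")


cases = [("What is 2 + 2?", "4"), ("What is 3 + 3?", "6")]
result = evaluate(toy_model, cases)
print(result["accuracy"])  # 0.5: one of the two canned cases matches
```

Replacing `toy_model` with a call into the selected SLM would let the same function report whether answers arrive within seconds, as the project aims for.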
e) Closure Phase:
Presentation and Demonstration: The project research work and recommendation
will be presented, and documents delivered to the client.
Project Handover: The comprehensive research report and training material will be
handed over to the client.
Project Closure and Evaluation: The project will be evaluated by the client, project
supervisor and mentor.