content. These policies should outline
ownership of content, licensing or usage
restrictions, and provisions for protecting
sensitive information. In collaborations
involving third parties, firms should ensure
that these issues are addressed during
contracting. It should be noted, however,
that there is still a great deal of uncertainty
around IP rights stemming from content
created through generative AI.
4) Security. LLM use cases can involve the
processing of extremely sensitive
business information. Firms should
conduct a thorough risk assessment
for each use case to identify and address
potential security issues ex ante.
For highly sensitive data, it is typically
recommended to host the LLM within a
firm-internal network, as the sketch at the
end of this point illustrates. If this is not possible,
collaborating with reputable external LLM
providers who adhere to stringent security
standards and are transparent about
their security practices is crucial. For data
falling under the GDPR, firms must ensure
that all data is stored and processed on
servers within Europe.
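As a hedged illustration of the self-hosting option, the following minimal sketch queries an open-source model entirely on firm-internal hardware using the Hugging Face transformers library. The model name (gpt2) and the prompt are placeholders chosen for illustration, not recommendations; a production deployment would additionally need access controls, audit logging, and, for GDPR-covered data, EU-resident storage.

    # Minimal sketch, assuming the Hugging Face transformers library is
    # installed and the model weights have been mirrored to internal storage.
    # Inference runs locally, so sensitive prompts never leave the network.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="gpt2",   # placeholder open model; substitute the vetted in-house choice
        device=-1,      # CPU; set a GPU index (e.g., 0) if one is available
    )

    # The prompt below is illustrative; in practice it would carry the
    # sensitive business information that must stay on-premises.
    prompt = "Summarize the key confidentiality risks in this contract:"
    result = generator(prompt, max_new_tokens=50, do_sample=False)
    print(result[0]["generated_text"])

Because the model is loaded and run locally rather than called through an external API, no request containing business data crosses the network boundary.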
5) Costs. Developing LLMs in-house is a
costly endeavor. It first requires significant
investment in hiring a highly
skilled workforce, including ML engineers
and NLP specialists, who tend to
command high salaries. The development
process itself is then time-consuming and
resource-intensive, involving extensive
research, data collection, model training,
and iterative improvement cycles,
all of which demand considerable
computing power and infrastructure
investment. Ongoing maintenance,
updates, licenses, and support require
continuous investment to ensure optimal
performance and reliability. Finally, it is
important to consider the opportunity
costs of allocating internal resources to
LLM development over core business
activities. While in-house development
offers several benefits, it diverts attention
and resources from other strategic
initiatives and can delay time-to-market.
Executives should
therefore carefully evaluate the financial
implications and weigh costs against
potential benefits before deciding to
develop LLMs in-house. Fine-tuning may
be a more suitable approach in many
cases, with substantially lower costs.
To address high development costs,
organizations could explore ways to
streamline the labeling and development
cycles. Leveraging pre-existing labeled
datasets or partnering with external
data providers can reduce the need for
extensive manual labeling, saving time
and resources. Additionally, adopting
cloud-based solutions for data storage
and processing can offer scalability and
cost-efficiency, enabling organizations
to handle large volumes of data more
effectively. A minimal fine-tuning sketch on a
pre-existing labeled dataset follows below.
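The hedged sketch below fine-tunes a small pre-trained model on a publicly available labeled dataset using the Hugging Face datasets and transformers libraries, avoiding manual labeling entirely. The dataset (imdb), the model (distilbert-base-uncased), the subsample sizes, and the hyperparameters are illustrative assumptions, not prescriptions.

    # Minimal sketch, assuming the Hugging Face datasets and transformers
    # libraries are installed. A public labeled dataset replaces costly
    # in-house labeling; a small model keeps compute requirements modest.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    dataset = load_dataset("imdb")  # pre-existing labeled sentiment data
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=256)

    tokenized = dataset.map(tokenize, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )

    # Deliberately small-scale settings to cap compute costs; a real
    # project would tune these on a validation set.
    args = TrainingArguments(
        output_dir="finetune-demo",
        num_train_epochs=1,
        per_device_train_batch_size=8,
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"].shuffle(seed=42).select(range(1000)),
        eval_dataset=tokenized["test"].shuffle(seed=42).select(range(500)),
    )
    trainer.train()

Subsampling the training data and choosing a compact model are the cost levers here: the same pattern scales up once a use case justifies the spend.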
6) Talent. The scarcity of experienced
professionals in fields such as data
science, ML, and NLP often makes it
difficult to establish a skilled in-house
team, especially for SMEs confronted
with resource constraints. In Europe,
the competition for top talent is fierce,
with SMEs and large firms alike facing
recruitment difficulties and talent
shortages. Additionally, the extremely
rapid pace of development in the field of LLMs
necessitates continuous learning and
professional development, meaning
companies should make significant
investments in training and upskilling their
workforce. Overcoming these hurdles
requires a strategic approach that can
include fostering partnerships with
academic institutions, collaborating with
external partners, offering competitive
salaries, and creating a stimulating work
environment that promotes innovation.
Firms already confronted with talent
scarcity may decide to source their LLM
solutions from the market to save direct
and indirect talent-related costs and to
utilize their talent resources for other
projects. In-house fine-tuning of models
often constitutes a middle course that
can strike a balance between acquiring
off-the-shelf products and developing
models from scratch.
7) Legal expertise. Developing LLMs in-house
requires firms to seek legal expertise
to navigate an increasingly complex
regulatory landscape. For instance,
the proposed EU AI Act, which focuses
on preventing harm to health, safety,
and fundamental human rights, would
involve a risk-based approach whereby AI
systems would be assigned to a risk class.
High-risk systems such as LLMs would
need to meet stricter requirements than
low-risk systems. Firms pursuing in-house