Summary

OpenAI announced a partnership with Oracle and Microsoft to extend the Microsoft Azure AI platform to Oracle Cloud Infrastructure (OCI). Meta introduced MobileLLM, a family of sub-billion-parameter language models optimized for mobile devices. Synthesia announced Synthesia 2.0, its AI video communications platform, and Anthropic introduced new prompt generation and evaluation features in the Claude Console. Microsoft released MInference 1.0 for long-context LLMs. THUDM introduced CodeGeeX4-ALL-9B, a multilingual code generation model. BAAI released EVE, an encoder-free vision-language model. Cohere published a study on optimizing RLHF in multilingual settings. In addition, BNP Paribas signed a partnership with Mistral AI.

OpenAI and Oracle Partnership

OpenAI Selects Oracle Cloud Infrastructure to Extend the Microsoft Azure AI Platform

Link, June 11, 2024,

  • OpenAI is partnering with Oracle and Microsoft to extend the Microsoft Azure AI platform to Oracle Cloud Infrastructure (OCI)
  • OCI extends the Azure platform and provides additional capacity so OpenAI can continue to scale
  • OCI's AI infrastructure is already used by thousands of AI innovators across industries worldwide
  • OCI's purpose-built AI capabilities let startups and enterprises build and train models faster and more reliably
  • Oracle positions its Gen2 AI infrastructure as the world's fastest and most cost-effective AI infrastructure
  • Larry Ellison, Oracle Chairman and CTO: "The race to build the world's greatest large language model is on, and it is fueling unlimited demand for Oracle's Gen2 AI infrastructure"
  • Sam Altman, OpenAI CEO: "We are delighted to be working with Microsoft and Oracle. OCI will extend Azure's platform and enable OpenAI to continue to scale"

Meta's MobileLLM

Meta, MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Link, July 9, 2024,

  • Meta designed language models optimized for mobile devices to address rising cloud costs and latency concerns
  • MobileLLM delivers high-quality models with fewer than one billion parameters
  • Uses a deep-and-thin architecture, embedding sharing, and grouped-query attention (see the sketch after this list)
  • MobileLLM-LS adds block-wise weight sharing for a further accuracy gain over the base MobileLLM models
  • The MobileLLM family shows significant improvements over previous sub-billion models on chat benchmarks
  • MobileLLM demonstrates correctness close to LLaMA-v2 7B on API calling tasks
  • The models are designed to run efficiently on mobile devices, cutting cloud costs and latency
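The bullets above name the main architectural choices. Below is a minimal, illustrative PyTorch sketch of a deep-and-thin decoder block with grouped-query attention and tied input/output embeddings; it is not Meta's released code, and the dimensions, layer count, and FFN (a plain SiLU MLP rather than the paper's SwiGLU) are assumptions for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GQABlock(nn.Module):
    # One thin transformer block with grouped-query attention (several query heads share one KV head).
    def __init__(self, dim=576, n_heads=9, n_kv_heads=3):
        super().__init__()
        self.nh, self.nkv, self.hd = n_heads, n_kv_heads, dim // n_heads
        self.q = nn.Linear(dim, n_heads * self.hd, bias=False)
        self.kv = nn.Linear(dim, 2 * n_kv_heads * self.hd, bias=False)
        self.o = nn.Linear(n_heads * self.hd, dim, bias=False)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
        self.n1, self.n2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x):
        b, t, _ = x.shape
        h = self.n1(x)
        q = self.q(h).view(b, t, self.nh, self.hd).transpose(1, 2)
        k, v = self.kv(h).chunk(2, dim=-1)
        k = k.view(b, t, self.nkv, self.hd).transpose(1, 2).repeat_interleave(self.nh // self.nkv, dim=1)
        v = v.view(b, t, self.nkv, self.hd).transpose(1, 2).repeat_interleave(self.nh // self.nkv, dim=1)
        att = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.o(att.transpose(1, 2).reshape(b, t, -1))
        return x + self.ffn(self.n2(x))

class TinyMobileLM(nn.Module):
    # Deep-and-thin stack; the input embedding matrix is reused as the output projection (embedding sharing).
    def __init__(self, vocab=32000, dim=576, n_layers=30):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.blocks = nn.ModuleList(GQABlock(dim) for _ in range(n_layers))
        self.norm = nn.LayerNorm(dim)

    def forward(self, ids):
        x = self.embed(ids)
        for blk in self.blocks:
            x = blk(x)
        return self.norm(x) @ self.embed.weight.T   # shared embeddings

model = TinyMobileLM()
print(model(torch.randint(0, 32000, (1, 8))).shape)   # torch.Size([1, 8, 32000])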

Synthesia 2.0 Announcement

Synthesia, the World's First AI Video Communications Platform

Link, June 25, 2024,

  • Synthesia 2.0 turns texts, PPTs, PDFs, and URLs into video in minutes
  • Two new types of Personal AI Avatars introduced, with a next generation of AI avatars previewed for later this year
  • AI Video Assistant converts an entire knowledge base into a library of videos
  • The new AI Screen Recorder turns screen recordings into video presentations, making high-quality videos easy to update
  • Personal AI avatars can be created in two ways: an Expressive Avatar shot in a studio with high-definition cameras, or a Custom Avatar filmed against a natural background with a webcam or phone
  • 1-click translation automatically translates videos into more than 120 languages
  • A new video player will offer personalized, real-time interactive experiences
  • AI safety is treated as a core part of development, with Synthesia on track for ISO/IEC 42001 certification

Anthropic Console Prompt Evaluation Features

Anthropic, Prompt Generation and Evaluation Features Added

Link, July 10, 2024,

  • The Anthropic Console now supports generating, testing, and evaluating prompts
  • A built-in prompt generator, powered by Claude 3.5 Sonnet, turns a task description into a high-quality prompt
  • Automatic test case generation and side-by-side output comparison added
  • The new features are aimed at improving prompt quality before deployment
  • Users can have Claude 3.5 Sonnet generate a prompt, auto-generate test cases, and inspect Claude's responses
  • The Evaluate feature tests prompts against a range of real-world inputs to check quality and build confidence before production (see the API sketch after this list)
  • Test cases can be added manually, imported from a CSV, or auto-generated by Claude
  • Test cases can be modified and all of them run in one click to compare results
  • Subject matter experts can grade response quality on a 5-point scale to confirm improvements
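The Console workflow is browser-based, but the same generate-and-evaluate loop can be approximated in code with the Anthropic Python SDK. A minimal sketch, assuming the anthropic package, an API key in the environment, and the claude-3-5-sonnet-20240620 model id; the prompt and test cases are illustrative.

import anthropic

client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from the environment

PROMPT = "Triage this inbound customer support request as BILLING, BUG, or OTHER:\n\n{ticket}"

test_cases = [
    "I was charged twice for my subscription this month.",
    "The export button crashes the app on Safari.",
]

for ticket in test_cases:
    msg = client.messages.create(
        model="claude-3-5-sonnet-20240620",   # assumed model id
        max_tokens=128,
        messages=[{"role": "user", "content": PROMPT.format(ticket=ticket)}],
    )
    # Inspect and compare responses across prompt versions, as the Console's Evaluate tab does in the UI.
    print(ticket, "->", msg.content[0].text.strip())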

Microsoft MInference 1.0 Announcement

Microsoft, MInference 1.0 for Long-Context LLMs

Link, July 7, 2024,

  • MInference 1.0 uses dynamic sparse attention to speed up pre-filling for long-context LLMs
  • Up to 10x faster pre-filling on a single A100 GPU
  • Supports models such as LLaMA-3-8B-1M and GLM-4-1M
  • To be presented at ICML'24 (ES-FoMo workshop)
  • Exploits the dynamic sparse nature of LLM attention, which exhibits some static patterns, to accelerate pre-filling
  • Each head's sparse pattern is determined offline; the sparse indices are approximated online and attention is computed dynamically with optimized custom kernels (see the toy sketch after this list)
  • Maintains accuracy while delivering up to a 10x pre-filling speedup for long-context LLMs
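A toy illustration (not Microsoft's kernels) of the online half of that idea: score key blocks cheaply with mean-pooled block representations, keep only the top-k key blocks per query block, and attend densely inside the selection. Block size, top-k, and the scoring rule are illustrative, and causal masking is omitted for brevity.

import torch

def block_sparse_attention(q, k, v, block=64, topk=4):
    # For each query block, rank key blocks by an approximate (mean-pooled) score,
    # keep the top-k, and run dense attention only inside the selected blocks.
    T, d = q.shape
    nb = T // block
    qb = q[: nb * block].view(nb, block, d)
    kb = k[: nb * block].view(nb, block, d)
    vb = v[: nb * block].view(nb, block, d)
    approx = qb.mean(1) @ kb.mean(1).T / d**0.5          # (nb, nb) block-level scores
    keep = approx.topk(min(topk, nb), dim=-1).indices     # indices of key blocks to keep
    out = torch.zeros_like(qb)
    for i in range(nb):
        ks = kb[keep[i]].reshape(-1, d)                   # selected keys
        vs = vb[keep[i]].reshape(-1, d)
        att = torch.softmax(qb[i] @ ks.T / d**0.5, dim=-1)
        out[i] = att @ vs
    return out.view(-1, d)

q = k = v = torch.randn(1024, 64)
print(block_sparse_attention(q, k, v).shape)   # torch.Size([1024, 64])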

CodeGeeX4-ALL-9B Model Introduced

THUDM, CodeGeeX4: Open Multilingual Code Generation Model

Link, July 5, 2024,

  • CodeGeeX4-ALL-9B is a multilingual code generation model continually trained on GLM-4-9B
  • Supports code completion and generation, a code interpreter, web search, function calling, and repository-level code Q&A
  • Highly competitive results on public benchmarks such as BigCodeBench and NaturalCodeBench
  • Beats the much larger CodeLlama 70B and is competitive with DeepSeek Coder 33B
  • Supports up to 128K context and covers a wide range of software development scenarios (a loading sketch follows this list)
  • Multilingual support makes it useful for the global software developer community
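A hedged sketch of loading the model with Hugging Face transformers. The repo id THUDM/codegeex4-all-9b, the trust_remote_code requirement, and the chat-template call are assumptions based on the model listing; check the official README for the exact recipe.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THUDM/codegeex4-all-9b"   # assumed Hugging Face repo id
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

prompt = "Write a Python function that parses an ISO-8601 date string."
inputs = tok.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
out = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens after the prompt.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))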

EVE Vision-Language Model Released

BAAI Unveils EVE, an Encoder-Free Vision-Language Model

Link, June 17, 2024,

  • EVE accepts vision and language inputs without a vision encoder
  • A single unified decoder bridges vision and language representations (see the sketch after this list)
  • Using only 35M publicly accessible samples, it rivals encoder-based VLMs of similar capacity
  • Strong results across multiple vision-language benchmarks
  • Provides a simple, effective training recipe so that encoder-free VLMs can be trained and run efficiently
  • Dropping the vision encoder removes its inductive biases and increases the model's flexibility and efficiency
  • EVE outperforms the counterpart Fuyu-8B and offers a transparent, efficient decoder-only architecture across modalities
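A toy sketch (not BAAI's code) of the encoder-free idea: image patches are linearly projected into the same embedding space as text tokens and the concatenated sequence is processed by one decoder. The dimensions are illustrative, and a bidirectional TransformerEncoder stands in for a causal LLM decoder.

import torch
import torch.nn as nn

class EncoderFreeVLM(nn.Module):
    def __init__(self, vocab=32000, dim=512, patch=14, layers=6):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # a light projection, not a ViT encoder
        self.tok_embed = nn.Embedding(vocab, dim)
        dec_layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers=layers)     # stand-in for a causal LLM decoder
        self.lm_head = nn.Linear(dim, vocab, bias=False)

    def forward(self, image, text_ids):
        p = self.patch_embed(image).flatten(2).transpose(1, 2)   # (B, num_patches, dim)
        t = self.tok_embed(text_ids)                              # (B, seq, dim)
        h = self.decoder(torch.cat([p, t], dim=1))                # one unified decoder over both modalities
        return self.lm_head(h[:, p.shape[1]:])                    # predict text tokens only

model = EncoderFreeVLM()
logits = model(torch.randn(1, 3, 224, 224), torch.randint(0, 32000, (1, 16)))
print(logits.shape)   # torch.Size([1, 16, 32000])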

RLHF Optimization in Multilingual Settings

Cohere, Study on Multilingual RLHF Optimization

Link, July 3, 2024,

  • Cohere studied how to optimize RLHF in multilingual settings
  • About 50K English prompts were translated into 22 languages to build multilingual preference (feedback) data
  • The Aya 23 8B model was used to compare the DPO (offline) and RLOO (online) RLHF methods (a DPO sketch follows this list)
  • RLOO shows stronger cross-lingual transfer than DPO
  • Preference optimization was then run on the multilingual feedback data
  • Training on 5 languages improved win rates on unseen languages by up to 19%
  • RLOO beats DPO by up to 10.6% in average win rate
  • Increasing the amount of data per language improves DPO but not RLOO
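For reference, a minimal sketch of the DPO objective, the offline method compared against online RLOO in the study; the beta value and the toy batch are illustrative.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Log-probability margins of the trained policy and the frozen reference model.
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    # DPO pushes the policy's chosen-vs-rejected margin above the reference's margin.
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Toy usage with per-example sequence log-probabilities.
lp = lambda: torch.randn(8)
print(dpo_loss(lp(), lp(), lp(), lp()))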

BNP Paribas and Mistral AI Partnership

BNP Paribas, Partnership Covering Mistral AI Models

Link, July 10, 2024,

  • BNP Paribas and Mistral AI signed a multi-year partnership covering Mistral AI's current and future commercial models
  • BNP Paribas will use Mistral AI's models across business lines including customer support, sales, and IT
  • Mistral AI's energy-efficient models offer scalability
  • The collaboration aims to develop innovative use cases that will redefine the future of financial services
  • BNP Paribas is already developing a range of use cases with Mistral AI's large language models across several of its business lines
  • Mistral AI's offering suits highly regulated institutions, enabling controlled, on-premises deployment of cutting-edge models with high energy efficiency and scalability
  • The partners plan innovative use cases in customer support, sales, IT, and other areas of financial services
  • Sophie Heller, COO of BNP Paribas Commercial, Personal Banking & Services, said Gen AI "will allow us to launch high quality virtual assistants to answer clients' questions 24/7 and to simplify end-to-end processes"
Sources

This GPT assists users by creating a detailed daily newspaper in Korean based on provided links. It follows these steps: read the content, summarize each content with detailed points, and write a report. The report format is:

(today’s date in 년 월 일) AI 소식,

Summary

(overall short summary, make summary with good details. for Summary section, explain the details starting with company name, e.g. OpenAI에서는 ~~~를 발표하였습니다.)

Title,

company name, 제목

링크, date,

  • detailed summary1, (개조식 문체 사용)
  • detailed summary2, (개조식 문체 사용)
  • detailed summary N, (개조식 문체 사용)

Title,

company name, 제목

링크, date,

  • detailed summary1, (개조식 문체 사용)
  • detailed summary2, (개조식 문체 사용)
  • detailed summary N, (개조식 문체 사용)
###
https://www.oracle.com/news/announcement/openai-selects-oracle-cloud-infrastructure-to-extend-microsoft-azure-ai-platform-2024-06-11/?source=:so:li:or:awr:ocorp:::&SC=:so:li:or:awr:ocorp:::&pcode=
Press Release
OpenAI Selects Oracle Cloud Infrastructure to Extend Microsoft Azure AI Platform
Austin, Texas—Jun 11, 2024
Oracle and OpenAI
Oracle, Microsoft, and OpenAI are partnering to extend the Microsoft Azure AI platform to Oracle Cloud Infrastructure (OCI) to provide additional capacity for OpenAI.

OpenAI is the AI research and development company behind ChatGPT, which provides generative AI services to more than 100 million users every month.

“We are delighted to be working with Microsoft and Oracle. OCI will extend Azure’s platform and enable OpenAI to continue to scale,” said Sam Altman, Chief Executive Officer, OpenAI.

“The race to build the world’s greatest large language model is on, and it is fueling unlimited demand for Oracle’s Gen2 AI infrastructure,” said Larry Ellison, Oracle Chairman and CTO. “Leaders like OpenAI are choosing OCI because it is the world’s fastest and most cost-effective AI infrastructure.”

OCI’s leading AI infrastructure is advancing AI innovation. OpenAI will join thousands of AI innovators across industries worldwide that run their AI workloads on OCI AI infrastructure. Adept, Modal, MosaicML, NVIDIA, Reka, Suno, Together AI, Twelve Labs, xAI, and others use OCI Supercluster to train and inference next-generation AI models.

OCI’s purpose-built AI capabilities enable startups and enterprises to build and train models faster and more reliably anywhere in Oracle’s distributed cloud. For training large language models (LLMs), OCI Supercluster can scale up to 64k NVIDIA Blackwell GPUs or GB200 Grace Blackwell Superchips connected by ultra-low-latency RDMA cluster networking and a choice of HPC storage. OCI Compute virtual machines and OCI’s bare metal NVIDIA GPU instances can power applications for generative AI, computer vision, natural language processing, recommendation systems, and more.



###
https://github.com/facebookresearch/MobileLLM
7/9/24
META
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu, Changsheng Zhao, Forrest Iandola, Chen Lai, Yuandong Tian, Igor Fedorov, Yunyang Xiong, Ernie Chang, Yangyang Shi, Raghuraman Krishnamoorthi, Liangzhen Lai, Vikas Chandra
This paper addresses the growing need for efficient large language models (LLMs) on mobile devices, driven by increasing cloud costs and latency concerns. We focus on designing top-quality LLMs with fewer than a billion parameters, a practical choice for mobile deployment. Contrary to prevailing belief emphasizing the pivotal role of data and parameter quantity in determining model quality, our investigation underscores the significance of model architecture for sub-billion scale LLMs. Leveraging deep and thin architectures, coupled with embedding sharing and grouped-query attention mechanisms, we establish a strong baseline network denoted as MobileLLM, which attains a remarkable 2.7%/4.3% accuracy boost over preceding 125M/350M state-of-the-art models. Additionally, we propose an immediate block-wise weight-sharing approach with no increase in model size and only marginal latency overhead. The resultant models, denoted as MobileLLM-LS, demonstrate a further accuracy enhancement of 0.7%/0.8% than MobileLLM 125M/350M. Moreover, MobileLLM model family shows significant improvements compared to previous sub-billion models on chat benchmarks, and demonstrates close correctness to LLaMA-v2 7B in API calling tasks, highlighting the capability of small models for common on-device use cases.
350M parameters is all you need! ⚡
Revisiting Meta's MobileLLM paper this morning:
> Reaches the same perf as LLaMA-2 7B in API calling; competitive at chat
> Train thin and deep networks (instead of wide)
> Grouped Query Attention (even for smaller networks)
> Block wise weight sharing (in-between adjacent blocks)
> Replace ReLU in FFN w/ SwiGLU
> Share embeddings (Input embedding weights w/ output fully connected layer weights)
They also scale the model up to 1.5B to see that these architectural changes hold. Spoiler alert: It does (see picture)
I don't expect Smol models to be incredibly great at Chat. However, I would expect them to work as routers to route to other on-device APIs or function calls. At 350M and W8A8, the overall size required to run the model is 350MB, which increases the surface area of devices to which we can bring "intelligence".
The paper links to a GH repo with pre-training code, but it's inaccessible. Do you plan to make it available AI at Meta?
Could be cool to reproduce this with FineWeb-Edu and see if the increase in pre-training quality results in even better downstream results!

###
https://www.technologyreview.kr/synthesias-hyperrealistic-deepfakes-will-soon-have-full-bodies/
Introducing Synthesia 2.0, the world’s first AI video communications platform built for the future of work
WRITTEN BY
DAN-VLAD COBASNEANU
PUBLISHED ON
JUNE 25, 2024


Turn your texts, PPTs, PDFs or URLs to video - in minutes.


Synthesia 2.0 is the world’s first AI video communications platform, reinventing every aspect of the video production and distribution process to help businesses create and share AI generated videos at scale
We’re introducing two new types of Personal AI Avatars and giving you a glimpse of a new generation of AI avatars coming later this year (spoiler alert: they have hands!)
AI Video Assistant will convert an entire knowledge base into a library of videos and supports brand elements such as an organization’s custom fonts, colors or logos
AI Screen Recorder is a new product that allows you to turn screen recordings into beautiful video presentations, powered by AI avatars
We’re building a new video player that can offer personalized and real-time, interactive experiences
Thanks to its pioneering work on AI safety, Synthesia is on track to achieve ISO/IEC 42001 certification, ensuring the responsible development and use of AI systems.
Today, we’re introducing Synthesia 2.0—the world’s first AI video communications platform for business—and sharing with you the new products and features we’re building to improve the way organizations and individuals communicate and share information.


Over the past 100 years, we've seen the rise of radio, television, the internet, and social media slowly shifting the way we communicate and share information, from text to video and audio. Just over a decade ago, video made up about 30% of internet traffic; today, it’s over 82% and growing exponentially. Globally, people spend on average 3 billion hours per day on TikTok, 1 billion hours per day on YouTube, and over 200 million hours per day on Netflix.

So, in our everyday lives, it’s clear that we’re already living in a video-first world. However, at work, we’re not quite there yet: most of our business communications still heavily rely on text while video is limited to major brand moments such as ads or keynotes or daily business interactions like video conferencing.

With Synthesia 2.0, we aim to reinvent every step of the video production pipeline from the ground up and create a single, powerful, and easy-to-use platform, enabling your entire business to transition to a video-first world and drive real business outcomes.

Introducing Personal AI Avatars

Avatars are at the core of Synthesia, and we’re constantly working on improving the quality and capabilities.

We’ve made it our goal to create the world’s most realistic AI avatars to help humans augment their capabilities. Last month, we introduced the world’s first Expressive AI Avatars, powered by our EXPRESS-1 model. These avatars understand what they’re saying and how they should say it, adjusting their tone of voice, facial expressions and body language based on the context of your script.

Many of our customers want to have their own avatar. With Synthesia 2.0 we’re making it a much easier experience and significantly increasing the quality and capabilities.

With Synthesia 2.0, you will have two ways of creating a personal avatar

An Expressive Avatar shot in a studio using high-definition cameras for a professional feel
A custom avatar in a natural background, using your webcam or phone at home or on the go. These new avatars improve on our existing webcam offering by providing better lip synchronization and a more natural voice, together with the ability to replicate your voice in over 30 languages

‎But we’re not stopping here.

Today, I am excited to share with you a glimpse into the future of our AI Avatars. Over the last 12 months, we’ve been capturing thousands of people in our studios all over the world. With this data, we’ve been training several large video and audio foundation models that can now work in lockstep to produce incredibly realistic and engaging avatars.

Up until now, avatars have mainly served as assistants in video. With this next generation they will be able to have personalities and tell captivating stories by using the full range of body language available to humans, including their hands. These new AI avatars will also be fully controllable: users will be able to specify avatar appearance with images and videos, and create animations with skeleton sequences.

Below you can see a clip of these full-body avatars in action:


Expect more news from us on this topic later in the year.

Bulk creation and brand templates coming to AI Video Assistant

If you've ever tried to write a script, you're probably familiar with “writer’s block” or the fear of the blank page.

To solve this problem, earlier this year we introduced our AI Video Assistant. Today, it enables you to simply select a template, write a prompt, upload an existing document or link, specify things like the tone of voice, length of your video, or audience, and with a click of a button, you get a draft of your video.

Since we launched it, it’s been widely adopted by our customers, and we’ve received great feedback on how we can improve it.

One key request was for the AI video assistant to incorporate your brand identity. We’re making this feature available next month, allowing users to create videos automatically with their brand elements, such as typography, colors, and logos, and achieve a consistent look and feel for all your videos.


A few months ago, during a conversation with one of our customers, we discovered they have hundreds of help articles they wish to convert into videos, as this would help their customers find answers more easily and save resources for their customer service team.

So we’re building bulk video creation with our AI Video Assistant. Soon you'll be able to simply select a template, provide a link to your knowledge center, and the AI video assistant will transform the articles into high-quality videos.


More intuitive editing with Triggers and our new AI Screen Recorder

Another thing we’ve learned from our customers is that most video editing tools are designed for professionals, or require extensive training. With Synthesia, we've dramatically simplified the editing process, without compromising on flexibility. In fact, 9 out of 10 people can create their first video in less than 10 minutes, without prior experience.

We’ve achieved that by replacing the traditional video timeline with simple triggers that you can control directly from your script. This change puts your script at the heart of your story, allowing you to animate video elements and make edits in a simple and intuitive way. It also simplifies scene content generation, creating a whole new editing experience that’s easy to use for everyone.


But what we’ve also learned is that many of our customers need to include screen-recorded content in their videos, but find the process complicated. Today, you’d have to use multiple tools to capture your screen, edit the recording, match the voiceover, and if you need to update it, you have to start all over again.

We believe there’s a better way with our upcoming AI Screen Recorder. Here’s how it works: let’s imagine we’re creating a step-by-step guide using a screen recorder so employees can see how to book time off through an online HR system.

You will be able to do this from Synthesia using the AI Screen Recorder. Once the recording is done, the video is immediately available for editing, with the voiceover transcribed, perfectly matching the screen capture, and automatic zoom effects to emphasize key actions.

From here, you can edit the script if needed, trim the video, and even add your own avatar and voice for a personal touch. The result is a sleek, high-quality video that can be easily updated.


The AI Screen Recorder is coming to Synthesia in the next few months.

Translations and a new, dynamic video player

Out of 4.2 billion internet users, only about 25% are English speakers. In a world where employees and customers are distributed globally, adapting communication to local languages and cultures is not just an option; it’s a massive business opportunity.

Translations are a complicated process which can take weeks or even months, delaying important communications and increasing costs.

About a year ago, we introduced the 1-click translations feature in Synthesia, which enables you to automatically translate your videos into over 120 languages with one click.

And even though that unlocked massive productivity gains for our customers, they still had to manage and maintain and share multiple files, which wasn't a good experience.

Today, we’re introducing the updated translation experience in Synthesia. You simply create one version of your video, translate it into any language you want, and if you need to update your video, just make changes to the original version. All other language versions will update automatically.


We are building a new type of video player, one that we believe will enable a new generation of video experiences that are interactive, personalized, and fun. The first feature we’re launching next month is the ability to simply share your video, and our player will automatically play it in your viewer's language. It’s quite magical and truly complements our translation capabilities.


Later in the year, we’re launching a whole suite of interactive capabilities for our player. You will be able to create rich video experiences with features such as clickable hotspots, embedded forms, quizzes, and personalized call-to-actions.


These capabilities will make your videos more engaging, drive higher viewer interaction, and unlock use cases that are simply impossible today.

AI safety built in from day one

We know generative AI is a powerful technology. We’ve seen how, in the hands of companies or individuals that don’t care about using AI responsibly, it can be misused.

That’s why, from day one, we’ve treated AI safety as a core part of building our products and growing our business - you can read more about our approach to responsible AI here. By doing so, we give our customers confidence that they can leverage our state-of-the-art AI capabilities while upholding ethical and legal obligations.

Thanks to these investments that we’ve made early on, Synthesia will soon be the first AI company in the world to achieve ISO/IEC 42001 certification. ISO/IEC 42001 is the world’s first standard for AI management, providing a structured way to manage risks and opportunities associated with AI, and balancing innovation with governance.

Be the first to experience Synthesia 2.0

We’ve reinvented every step of video production from the ground up and created one, incredibly powerful, yet remarkably easy-to-use platform, enabling your business to transition to a video-first world and drive business outcomes.


###
https://www.anthropic.com/news/evaluate-prompts
Evaluate prompts in the developer console
July 10, 2024

Illustration of Claude using tools
When building AI-powered applications, prompt quality significantly impacts results. But crafting high quality prompts is challenging, requiring deep knowledge of your application's needs and expertise with large language models. To speed up development and improve outcomes, we've streamlined this process to make it easier for users to produce high quality prompts.

You can now generate, test, and evaluate your prompts in the Anthropic Console. We've added new features, including the ability to generate automatic test cases and compare outputs, that allow you to leverage Claude to generate the very best responses for your needs.

Generate prompts
Writing a great prompt can be as simple as describing a task to Claude. The Console offers a built-in prompt generator, powered by Claude 3.5 Sonnet, that allows you to describe your task (e.g. “Triage inbound customer support requests”) and have Claude generate a high-quality prompt for you.

App screen of Anthropic Console prompt generator
You can use Claude’s new test case generation feature to generate input variables for your prompt—for instance, an inbound customer support message—and run the prompt to see Claude’s response. Alternatively, you can enter test cases manually.

App screen of prompt generation and Claude response
Generate a test suite
Testing prompts against a range of real-world inputs can help you build confidence in the quality of your prompt before deploying it to production. With the new Evaluate feature you can do this directly in our Console instead of manually managing tests across spreadsheets or code.

Manually add or import new test cases from a CSV, or ask Claude to auto-generate test cases for you with the ‘Generate Test Case’ feature. Modify your test cases as needed, then run all of the test cases in one click. View and adjust Claude’s understanding of the generation requirements for each variable to get finer-grained control over the test cases Claude generates.

App screen of comparison mode of different prompt responses
Evaluate model responses and iterate on prompts
Refining your prompt now takes fewer steps, since you can create new versions of the prompt and re-run the test suite to quickly iterate and improve your results. We’ve also added the ability to compare the outputs of two or more prompts side by side.

You can even have subject matter experts grade response quality on a 5-point scale in order to see whether the changes you’ve made have improved response quality. Both of these features enable a faster and more accessible way to improve model performance.


Get started
Test case generation and output comparison features are available to all users on the Anthropic Console. To learn more about how to generate and evaluate prompts with Claude, check out our docs.

###
https://github.com/microsoft/MInference
Microsoft
MInference: Million-Tokens Prompt Inference for Long-context LLMs
| Project Page | Paper | HF Demo |

MInference_demo.mp4
Now, you can process 1M context 10x faster in a single A100 using Long-context LLMs like LLaMA-3-8B-1M, GLM-4-1M, with even better accuracy, try MInference 1.0 right now!

News
🪗 [24/07/07] Thanks @AK for sponsoring. You can now use MInference online in the HF Demo with ZeroGPU.
📃 [24/07/03] Due to an issue with arXiv, the PDF is currently unavailable there. You can find the paper at this link.
🧩 [24/07/03] We will present MInference 1.0 at the Microsoft Booth and ES-FoMo at ICML'24. See you in Vienna!
TL;DR
MInference 1.0 leverages the dynamic sparse nature of LLMs' attention, which exhibits some static patterns, to speed up the pre-filling for long-context LLMs. It first determines offline which sparse pattern each head belongs to, then approximates the sparse index online and dynamically computes attention with the optimal custom kernels. This approach achieves up to a 10x speedup for pre-filling on an A100 while maintaining accuracy.

MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention (Under Review, ES-FoMo @ ICML'24)
Huiqiang Jiang†, Yucheng Li†, Chengruidong Zhang†, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H. Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang and Lili Qiu

###
https://github.com/THUDM/CodeGeeX4
7/5/24
CodeGeeX4: Open Multilingual Code Generation Model
We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on the GLM-4-9B, significantly enhancing its code generation capabilities. Using a single CodeGeeX4-ALL-9B model, it can support comprehensive functions such as code completion and generation, code interpreter, web search, function call, repository-level code Q&A, covering various scenarios of software development. CodeGeeX4-ALL-9B has achieved highly competitive performance on public benchmarks, such as BigCodeBench and NaturalCodeBench. It is currently the most powerful code generation model with less than 10B parameters, even surpassing much larger general-purpose models, achieving the best balance in terms of inference speed and model performance.

Model List
Model: codegeex4-all-9b | Type: Chat | Seq Length: 128K | Download: 🤗 Huggingface, 🤖 ModelScope, 🟣 WiseModel
New Drop: CodeGeeX4 9B by ChatGLM 🔥
> Beats CodeLlama 70B (7x size), competitive with DeepSeek Coder 33B
> Multilingual code generation model, continually trained on ChatGLM 9B
> Up to 128K context
> Supports code completion and generation, code interpreter, web search, function call, repository-level code Q&A
> Scored 48.9 and 40.4 for the complete and instruct tasks of BigCodeBench
It is a powerful model for a local code assistant. Congrats, the THUDM team, on yet another brilliant release!

###
https://huggingface.co/BAAI/EVE-7B-v1.0
EVE: Unveiling Encoder-Free Vision-Language Models
Unveiling Encoder-Free Vision-Language Models
Haiwen Diao*, Yufeng Cui*, Xiaotong Li, Yueze Wang, Huchuan Lu📧, Xinlong Wang📧

Dalian University of Technology; Beijing Academy of Artificial Intelligence; Peking University

| Paper | Code |
[Submitted on 17 Jun 2024]
Existing vision-language models (VLMs) mostly rely on vision encoders to extract visual features followed by large language models (LLMs) for visual-language tasks. However, the vision encoders set a strong inductive bias in abstracting visual representation, e.g., resolution, aspect ratio, and semantic priors, which could impede the flexibility and efficiency of the VLMs. Training pure VLMs that accept the seamless vision and language inputs, i.e., without vision encoders, remains challenging and rarely explored. Empirical observations reveal that direct training without encoders results in slow convergence and large performance gaps. In this work, we bridge the gap between encoder-based and encoder-free models, and present a simple yet effective training recipe towards pure VLMs. Specifically, we unveil the key aspects of training encoder-free VLMs efficiently via thorough experiments: (1) Bridging vision-language representation inside one unified decoder; (2) Enhancing visual recognition capability via extra supervision. With these strategies, we launch EVE, an encoder-free vision-language model that can be trained and forwarded efficiently. Notably, solely utilizing 35M publicly accessible data, EVE can impressively rival the encoder-based VLMs of similar capacities across multiple vision-language benchmarks. It significantly outperforms the counterpart Fuyu-8B with mysterious training procedures and undisclosed training data. We believe that EVE provides a transparent and efficient route for developing a pure decoder-only architecture across modalities.


###
https://github.com/cohere-ai/cohere-toolkit
Cohere Toolkit
Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.



Try Now:
Try the default Toolkit application yourself by deploying it in a container locally. Either with docker run, using the pre-built Docker image provided (note: this does not contain community tools):

docker run -e COHERE_API_KEY='>>YOUR_API_KEY<<' -p 8000:8000 -p 4000:4000 ghcr.io/cohere-ai/cohere-toolkit:latest
or cloning and running locally:

Note: to include community tools when building locally, set the INSTALL_COMMUNITY_DEPS build arg in the docker-compose.yml to true.

git clone https://github.com/cohere-ai/cohere-toolkit.git
cd cohere-toolkit
make first-run
Go to localhost:4000 in your browser and start chatting with the model.

For the above you will need to have Docker and Docker-compose >= 2.22 installed. Go here for a more detailed setup.

About Toolkit


Interfaces - these can be any frontend, application, bot or integration. You can customize any type of interface for your use case. By default included is:
Cohere's Web UI at src/interfaces/coral_web - A web app built in Next.js. Includes a simple SQL database out of the box to store conversation history in the app.
Backend API - src/backend follows a similar structure to the Cohere Chat API but also includes customizable elements:
Model - you can customize which provider you use to access Cohere's Command models. Included in the toolkit by default are Cohere's Platform, Sagemaker, Azure, Bedrock, HuggingFace, and local models. More details here.
Retrieval - you can customize the tools and data sources that the application runs with. By default, we have configured a Langchain data retriever to test RAG on Wikipedia and your own uploaded documents. It is possible to add any tool, including tools or retrievers from LangChain or LlamaIndex. You can also use a connector you have created.
Service Deployment Guides - we also include guides for how to deploy the toolkit services in production including with AWS, GCP and Azure. More details here.

###
https://huggingface.co/papers/2407.02552
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
Published on Jul 3
Authors: John Dang, Arash Ahmadian, Kelly Marchisio, Julia Kreutzer, Ahmet Üstün, Sara Hooker
Abstract
Preference optimization techniques have become a standard final stage for training state-of-art large language models (LLMs). However, despite widespread adoption, the vast majority of work to-date has focused on first-class citizen languages like English and Chinese. This captures a small fraction of the languages in the world, but also makes it unclear which aspects of current state-of-the-art research transfer to a multilingual setting. In this work, we perform an exhaustive study to achieve a new state-of-the-art in aligning multilingual LLMs. We introduce a novel, scalable method for generating high-quality multilingual feedback data to balance data coverage. We establish the benefits of cross-lingual transfer and increased dataset size in preference training. Our preference-trained model achieves a 54.4% win-rate against Aya 23 8B, the current state-of-the-art multilingual LLM in its parameter class, and a 69.5% win-rate or higher against widely used models like Gemma-1.1-7B-it, Llama-3-8B-Instruct, Mistral-7B-Instruct-v0.3. As a result of our study, we expand the frontier of alignment techniques to 23 languages covering half of the world's population.
Does RLHF transfer to different languages? RLHF Can Speak Many Languages! Cohere shows that training on one or multiple languages improves performance on unseen languages, and that online RLHF methods have stronger transfer capabilities than offline methods. 👀
Experiments:
1️⃣ Created synthetic multilingual preference dataset using ~50K English prompts from ShareGPT, translated to 22 languages. Completions were generated using Cohere's Command and Command R+ models, with Cohere May 2024 as the Reward Model.
2️⃣ Created 4 dataset Mixtures: EN-1-50K: English-only, 50K prompts; ML-5-50K: 5 languages, 10K prompts each; ML-23-50K: 23 languages, ~2.2K prompts each; ML-23-230K: 23 languages, 10K prompts each.
3️⃣ Used Aya 23 8B as Base Model (SFT) and for RLHF DPO (offline) and RLOO (online).
4️⃣ Trained on all mixtures with both methods and evaluated them using win-rates judged by GPT-4-Turbo.
Learnings:
🌐 Training only on English preference data leads to up to 7% performance improvements on other languages.
🌍 Training in 5 languages increased win rates in unseen languages by up to 19%.
⚡ Online (RLOO) outperforms offline (DPO) by up to 10.6% in average win-rates.
🔄 Online (RLOO) shows stronger language transfer capabilities than offline (DPO).
📈 Increasing data from 2K to 10K examples per language improves DPO but not RLOO.

###
https://arxiv.org/abs/2407.01219
[Submitted on 1 Jul 2024]
Searching for Best Practices in Retrieval-Augmented Generation
Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuanjing Huang
Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolonged response times. Typically, a RAG workflow involves multiple processing steps, each of which can be executed in various ways. Here, we investigate existing RAG approaches and their potential combinations to identify optimal RAG practices. Through extensive experiments, we suggest several strategies for deploying RAG that balance both performance and efficiency. Moreover, we demonstrate that multimodal retrieval techniques can significantly enhance question-answering capabilities about visual inputs and accelerate the generation of multimodal content using a "retrieval as generation" strategy.
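The workflow the paper studies is the standard retrieve-then-generate loop. A minimal, generic sketch of that loop follows; the hashing "embedding" and the prompt-only generator are placeholders, not modules the paper recommends.

import numpy as np

docs = ["RAG retrieves passages before generation.",
        "Reranking can improve retrieval precision.",
        "Query rewriting helps with ambiguous questions."]

def embed(text):
    # Placeholder embedding: character trigrams hashed into a fixed-size vector.
    v = np.zeros(256)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % 256] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)

def retrieve(query, k=2):
    # Rank documents by cosine similarity to the query and keep the top-k.
    scores = [float(embed(query) @ embed(d)) for d in docs]
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

def answer(query):
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return prompt   # pass this prompt to any LLM of choice

print(answer("How can retrieval precision be improved?"))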


###
https://huggingface.co/papers/2407.01906
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models
Published on Jul 2 · Submitted by philschmid on Jul 5 · #2 Paper of the day
Authors: Zihan Wang, Deli Chen, Damai Dai, Runxin Xu, Zhuoshu Li, Y. Wu
Abstract
Parameter-efficient fine-tuning (PEFT) is crucial for customizing Large Language Models (LLMs) with constrained resources. Although there have been various PEFT methods for dense-architecture LLMs, PEFT for sparse-architecture LLMs is still underexplored. In this work, we study the PEFT method for LLMs with the Mixture-of-Experts (MoE) architecture and the contents of this work are mainly threefold: (1) We investigate the dispersion degree of the activated experts in customized tasks, and found that the routing distribution for a specific task tends to be highly concentrated, while the distribution of activated experts varies significantly across different tasks. (2) We propose Expert-Specialized Fine-Tuning, or ESFT, which tunes the experts most relevant to downstream tasks while freezing the other experts and modules; experimental results demonstrate that our method not only improves the tuning efficiency, but also matches or even surpasses the performance of full-parameter fine-tuning. (3) We further analyze the impact of the MoE architecture on expert-specialized fine-tuning. We find that MoE models with finer-grained experts are more advantageous in selecting the combination of experts that are most relevant to downstream tasks, thereby enhancing both the training efficiency and effectiveness.
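A toy sketch (not the paper's implementation) of the ESFT idea: count how often each expert is routed to on a task's data, then unfreeze only the most-used experts and keep the rest of the model frozen. The model.router and model.experts attributes are hypothetical stand-ins for a real MoE implementation.

import torch

def select_and_freeze_experts(model, task_batches, keep_ratio=0.1):
    counts = torch.zeros(len(model.experts))
    with torch.no_grad():
        for batch in task_batches:
            # Assumed router API: returns, per token, the indices of its activated experts.
            topk_idx = model.router(batch)   # (tokens, k) expert ids
            counts += torch.bincount(topk_idx.flatten(), minlength=len(model.experts)).float()

    k = max(1, int(keep_ratio * len(model.experts)))
    keep = set(counts.topk(k).indices.tolist())   # most task-relevant experts

    for p in model.parameters():                  # freeze the whole model...
        p.requires_grad_(False)
    for i in keep:                                # ...then unfreeze only the selected experts
        for p in model.experts[i].parameters():
            p.requires_grad_(True)
    return keep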

###
https://teams.microsoft.com/l/message/19:f739398261c24c98805f7d4dc4adbcd6@thread.tacv2/1720572108761?tenantId=fcc97bbd-70b6-4980-b261-7129b612d32f&groupId=67023d7e-4e43-40f6-ae38-c4fd60ba57bc&parentMessageId=1720572108761&teamName=%EA%B8%88%EC%9C%B5AI%EC%84%BC%ED%84%B0%20AI%ED%85%8C%ED%81%AC%ED%8C%80&channelName=AI%EC%84%A0%ED%96%89%EA%B8%B0%EC%88%A0&createdTime=1720572108761
6/25/24
Scale AI's Alex Wang on the US-China AI Race
How AI affects the global balance of power and how national security can be secured
On June 25, 2024, ChinaTalk published an interview with Alex Wang, CEO of Scale AI, discussing AI's impact on the global balance of power and how the US and its allies can maintain national security.
Key discussion points
Three main drivers of progress toward AGI (Artificial General Intelligence):
Compute: Moore's law and the growth of computing power make far more compute available than in the past; in particular, advances in GPUs have made it possible to train models on large compute clusters.
Data: Since the early 2010s, as deep learning and neural networks took off, the amount of data used has grown exponentially. From the early ImageNet dataset to today, the data used to train models keeps increasing and plays a key role in improving model performance.
Algorithms: Algorithmic innovation is the other major driver of AI progress. Training large language models splits into pre-training and post-training; post-training uses high-quality data to shape the model into practical capabilities.
China's AGI competitiveness:
Strengths: large-scale data collection and use, strong government support, rapid technology adoption.
Weaknesses: limits in cutting-edge semiconductor manufacturing and high-performance computing infrastructure; compared with leading chipmakers such as NVIDIA, China still lags in performance and cost.
National security and the AI race:
Why national security matters: winning the AI race can have major national-security consequences. If AI is applied to military technology, it could fundamentally change the nature of warfare and deterrence.
Why data matters: maximizing AI model performance requires high-quality frontier data, data that embeds expert knowledge and experience so that models can excel at specific tasks.
Preventing AI espionage:
Harden security: the security of labs and AI development environments must be strengthened substantially; as the recent case of a Google engineer accused of spying showed, current security levels are very weak.
Avoid profiling immigrants: to attract the best talent, reduce bias against immigrants and maintain an open environment; this is a key factor in the US sustaining its lead in the AI race.
Getting past the data wall
Wang explained that AI's future progress depends heavily on the quantity and quality of data. Internet data has been nearly exhausted and new data is created very slowly, so producing high-quality expert data at scale is critical, including building environments in which AI models can improve through self-learning.
Securing expert data:
Produce, at scale, data that captures the knowledge and experience of top experts so that AI models can excel at specific tasks.
Generating synthetic data:
Build synthetic environments, such as games, in which AI models can interact and learn; human experts construct these environments to maximize what the models learn from them.
Strengthening data security:
Introduce strong security measures against industrial espionage, especially to protect critical assets such as model training weights.
How the US can secure an AI data advantage
Produce high-quality expert data:
Produce high-quality expert data at scale to maximize model performance, meaning data that captures the knowledge and experience of financial analysts, military experts, intelligence analysts, and the like.
Build synthetic environments:
Build synthetic environments in which AI models can improve through self-learning, i.e., game-like settings where models interact and learn.
Strengthen data security:
Substantially harden the security of AI labs and development environments to prevent industrial espionage, in particular protecting critical data such as model training weights.
Assessing China's AI competitiveness
Wang assessed China's AI ecosystem using the following key indicators:
Chinese chip quality: Huawei's Ascend 910B delivers roughly 80% of the performance of comparable NVIDIA chips while costing two to three times more.
Production volume: Huawei produces about 100,000 chips per quarter versus NVIDIA's one million; this ratio should be monitored closely.
Power supply: China is adding far more electricity capacity than the US, which matters for AI; it is particularly well positioned in nuclear power.
Conclusion
AI can have major national-security implications, and to win the AI race the US should focus on securing high-quality data, strengthening security, and attracting and protecting talent. The US government should treat AI as a military technology and expand investment accordingly; this is essential for long-term national security and economic growth.

###
https://group.bnpparibas/en/press-release/bnp-paribas-and-mistral-ai-sign-a-partnership-agreement-covering-all-mistral-ai-models
PRESS RELEASE
BNP Paribas and Mistral AI sign a partnership agreement covering all Mistral AI models
Published on 10.07.2024


The agreement is a multi-year partnership to provide access to current and future Mistral AI commercial models across all the bank’s business lines. It follows a relationship dating back to September 2023 when the Global Markets division of BNP Paribas began experimenting with Mistral AI’s models. This first engagement produced strong results and, as a consequence, BNP Paribas extended the collaboration to the wider Group starting February 2024. Since that time, BNP Paribas has been extensively piloting Mistral AI commercial models across several of the bank’s divisions.

By using Mistral AI Large Language Models, BNP Paribas is developing a number of use cases in its businesses across customer support, sales, IT and other areas. Mistral AI’s offering and strategy is complementary to highly regulated institutions, facilitating controlled deployment of cutting-edge models on premises. A further advantage of working with Mistral AI is scalability, as they strive to deliver energy-efficient models.

“Our collaboration with BNP Paribas signifies a strong leap towards achieving Mistral AI's mission of making AI accessible for all. We are pleased to be working so closely with their team, integrating our cutting-edge generative AI models into the banking ecosystem and so many of their business lines. I eagerly anticipate the expansion of our partnership, as we continue to develop innovative use cases that will redefine the future of financial services.” Arthur Mensch, CEO, Mistral AI

“Our agreement with Mistral AI marks a major milestone in our digital strategy, and our ambition to be the number one European markets house. Generative AI has significant potential to enhance our client offering across sales, trading, research and more, and I am excited to continue our work with Mistral AI towards that goal.” Olivier Osty, Head of BNP Paribas Global Markets

"This partnership with Mistral AI marks a further step in developing hyper-personalised digital services for our customers. As an example, Gen AI will allow us to launch high quality virtual assistants to answer clients’ questions 24/7 and to simplify end-to-end processes, enhancing the way our teams support clients. Deploying Gen AI models within our infrastructure will ally the latest technology with our strong commitment for security." Sophie Heller, Chief Operating Officer at BNP Paribas Commercial, Personal Banking & Services