dnth committed on
Commit 6e22f4d · verified · 1 Parent(s): e60f1e7

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
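The config above enables only `pooling_mode_mean_tokens`, so the sentence vector is the average of the token embeddings. A minimal pure-Python sketch of that idea follows, assuming padding positions are marked with 0 in an attention mask; the function name `mean_pool` and the 2-dimensional toy vectors are illustrative, not part of this commit, and the library's own pooling module works on batched tensors of width 768.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of real tokens, skipping padding (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding row is ignored, which is why the large values in the third token do not affect the result.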
README.md ADDED
@@ -0,0 +1,635 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:7540
+ - loss:MultipleNegativesRankingLoss
+ base_model: nomic-ai/modernbert-embed-base
+ widget:
+ - source_sentence: The Chief Engineer/Senior Engineering Manager (Automatic Fare Collection)
+     leads and facilitates the implementation of Automatic Fare Collection (AFC) maintenance
+     regime within the organisation. He/She works closely with the authorities in implementing
+     new engineering initiatives to enhance the reliability of AFC systems. He demonstrates
+     his technical expertise in providing advice to cross-disciplinary engineering
+     studies. His role also includes the establishment of competency standards and
+     engineering standards to ensure staff are equipped with relevant skills. He excels
+     in operating in a collaborative environment and functions through his understanding
+     of the operational activities, industry developments and regulatory requirements.
+     He maintains a forward-thinking mindset to contribute strategically towards achieving
+     the department's goals.
+   sentences:
+   - The Chief Engineer/Senior Engineering Manager (Rail Signalling Systems) manages
+     the maintenance and upgrade schedules for rail signalling infrastructure across
+     the network. He/She partners with transportation authorities to implement new
+     signalling technologies and provides expert guidance on safety and technical protocols.
+     This role focuses on developing and enforcing engineering standards specific to
+     signalling equipment and training personnel accordingly. The manager thrives in
+     a multidisciplinary team environment, using knowledge of signalling operations,
+     regulatory mandates, and emerging technologies to ensure system safety and reliability,
+     while contributing to strategic planning within the transit division.
+   - The Assistant Horticulturist supports the management and nurturing of plant life
+     within the organisation’s attraction sites. This role involves assisting in the
+     upkeep of diverse plant collections and delivering informative presentations to
+     visitors about the flora and conservation efforts. With keen attention to detail
+     and a proactive approach, the Assistant Horticulturist monitors plant health and
+     characteristics, reporting observations accurately. The position requires the
+     ability to work independently or under supervision, includes physical tasks, and
+     involves working on a rotating schedule covering weekends, public holidays, and
+     on-call duties. Extended outdoor work in various weather conditions is expected,
+     and a valid driving licence may be necessary for duties in expansive park areas.
+   - The Chief Engineer/Senior Engineering Manager (Automatic Fare Collection) is responsible
+     for overseeing the deployment and upkeep of the AFC system maintenance program
+     within the organization. This role involves close collaboration with regulatory
+     bodies to introduce innovative engineering solutions aimed at improving AFC system
+     performance and dependability. The incumbent applies deep technical knowledge
+     to support interdisciplinary engineering projects and leads the formulation of
+     competency frameworks and technical standards to ensure team proficiency. Operating
+     effectively in a cooperative setting, the manager leverages insights into operational
+     workflows, industry trends, and compliance standards, adopting a strategic outlook
+     to drive the department's long-term objectives.
+ - source_sentence: The Senior Application Chemist leads technical work and projects
+     for product development and innovation, and validates the development of application-specific
+     solutions and new analytical methods, based on technological know-how. He/She
+     studies market trends and customer needs to assess the feasibility of expanding
+     existing product lines, in accordance with the organisations business needs. The
+     Senior Application Chemist supports the technical service team by managing the
+     execution of technical service, application and product development-related projects
+     with customers. He also provides technical expertise in troubleshooting technical
+     issues reported by customers. In addition, he coaches and mentors junior staff
+     in the application team, and is responsible for managing the teams performance
+     to achieve organisational goals. The Senior Application Chemist leads a team in
+     the laboratory, and collaborates closely with the technical service, Research
+     and Development (R&D), and sales and marketing teams. He is creative and enjoys
+     solving complex problems. He can manage multiple projects effectively, and possesses
+     excellent technical writing and presentation skills.
+   sentences:
+   - The Relationship Management Director - Commercial leads the development and implementation
+     of client acquisition strategies, providing clear guidelines to support team members
+     in cultivating strong client partnerships. This role involves staying informed
+     about industry trends and sub-sector developments to enhance client service offerings.
+     The director ensures the team is well-trained on relevant market changes and oversees
+     credit analysis procedures in compliance with company standards. By guiding and
+     motivating the team, the director drives performance excellence and fosters a
+     professional environment that nurtures long-term client engagement. Possessing
+     keen business insight, the director identifies growth opportunities and influences
+     stakeholders effectively to achieve business goals, while maintaining a focus
+     on continuous improvement and team cohesion.
+   - The Senior Regulatory Affairs Specialist manages compliance projects within the
+     pharmaceutical industry, ensuring all products meet regional and international
+     regulatory requirements. This role involves coordinating submissions, monitoring
+     changes in legislation, and liaising with regulatory authorities to facilitate
+     product approvals. The Senior Regulatory Affairs Specialist leads a team responsible
+     for regulatory strategy and documentation, providing training and mentorship to
+     junior staff. Collaboration with quality assurance, manufacturing, and marketing
+     teams is essential to maintain adherence to regulatory standards. Strong project
+     management, attention to detail, and knowledge of regulatory frameworks are critical
+     for success in this position.
+   - The Senior Application Chemist is responsible for directing technical projects
+     and pioneering product innovations while developing and validating new analytical
+     techniques tailored to specific applications. This role involves analyzing market
+     trends and customer requirements to determine the potential for expanding product
+     offerings aligned with corporate objectives. The Senior Application Chemist collaborates
+     with the technical service team to oversee project execution related to applications
+     and product development, providing expert guidance in resolving customer technical
+     challenges. Additionally, the incumbent mentors junior team members, evaluates
+     team performance, and ensures alignment with organizational targets. Leading a
+     laboratory team, the Senior Application Chemist works closely with Research and
+     Development, sales, and marketing departments, demonstrating strong problem-solving
+     capabilities, effective multitasking, and proficient communication skills in technical
+     documentation and presentations.
+ - source_sentence: The Technician (Assembly) performs assembly tasks for aircraft
+     components in accordance with technical manuals and standard operating procedures
+     (SOPs). He/She operates workshop equipment, tools and machines for the assembly
+     of aircraft components. He also keeps abreast of latest developments of related
+     systems by updating himself through relevant manuals and other publications. He
+     may be authorised by the organisation to perform quality control functions, including
+     inspection of incoming materials and assembled components and parts, and registration
+     of non-conformances. He may also be authorised to perform level 1 non-destructive
+     testing (NDT) functions under supervision, evaluate for acceptance or rejection,
+     and record results as specified in the work instructions. He complies with airworthiness
+     and legislative requirements, and the organisation's safety, health and quality
+     systems. He supports in implementation of continuous improvement initiatives and
+     lean practices. He works in a hangar or workshop and may be required to work in
+     shifts. He should be systematic and detail-oriented, and able to work independently
+     and in a team to accomplish assigned tasks.
+   sentences:
+   - The Technician (Assembly) is responsible for assembling aircraft parts following
+     detailed technical manuals and established standard operating procedures. This
+     role involves the operation of various workshop machinery, tools, and equipment
+     to ensure precise assembly of aircraft components. The Technician stays updated
+     on the latest system advancements by reviewing relevant technical literature and
+     manuals. Authorized by the company, the Technician may conduct quality assurance
+     activities, including inspecting incoming materials and assembled parts, as well
+     as documenting any non-conformities. Additionally, the Technician may perform
+     supervised level 1 non-destructive testing (NDT), assessing components for compliance
+     and accurately recording results in line with work instructions. Compliance with
+     aviation safety standards, airworthiness regulations, and internal quality and
+     health protocols is essential. The Technician actively participates in continuous
+     improvement and lean methodology initiatives. Work is typically carried out in
+     a workshop or hangar environment, often involving shift work. The ideal candidate
+     is meticulous, organized, and capable of working autonomously or collaboratively
+     to complete assigned duties.
+   - The Assistant Stage Manager supports the Stage Manager throughout all phases of
+     production, including pre-production planning, rehearsals, live performances,
+     and post-production tasks. Responsibilities include attending production meetings,
+     facilitating communication among creative and technical teams, coordinating rehearsal
+     schedules, preparing and maintaining production documentation, and managing onstage
+     operations during rehearsals and shows as directed. They may also handle the procurement
+     and organization of props and costumes, and for extended runs, they might take
+     on show calling duties or serve as an alternate show caller to ensure seamless
+     performances.
+   - The Technician (Assembly) specializes in the repair and maintenance of automotive
+     engines, utilizing diagnostic tools and automotive repair equipment to troubleshoot
+     and fix mechanical issues. This role requires familiarity with vehicle service
+     manuals and adherence to road safety regulations and environmental standards.
+     The Technician performs routine inspections, identifies faulty parts, and carries
+     out component replacements to ensure optimal vehicle performance. Responsibilities
+     include maintaining detailed service records and collaborating with service advisors
+     to provide customers with accurate repair timelines. Work is conducted primarily
+     in an automotive workshop, with occasional overtime during peak periods. Strong
+     problem-solving skills, a customer-focused attitude, and the ability to work independently
+     or as part of a team are essential for success in this position.
+ - source_sentence: The Senior Infant Educator plays an active role as a mentor to
+     the Infant Educator team. He/She takes responsibility for coaching and leading
+     the infant care team in the Centre. He plays an important role in the design and
+     implementation of developmentally appropriate curricula and programmes for the
+     day-to-day developmental and caregiving tasks for infants. He also leads the building
+     of relationships and partnerships with stakeholders. He designs and implements
+     family and community programmes, and contributes to the Centres culture of continuous
+     learning, collaboration and collegiality, in line with its vision, mission and
+     goals.
+   sentences:
+   - The Associate Applications Support Engineer is tasked with maintaining and supporting
+     key software applications, whether developed internally or sourced from third
+     parties. This role requires comprehensive knowledge of application functionalities
+     and backend systems. The engineer collaborates closely with development, transition,
+     and testing teams to troubleshoot, document, and resolve application issues. Working
+     within a team environment, the engineer utilizes proficiency in application development
+     and monitoring tools aligned with organizational standards. Familiarity with the
+     software platforms hosting the solutions is essential. The role demands strong
+     analytical abilities, a problem-solving mindset, and excellent communication skills
+     to effectively address technical challenges.
+   - The Senior Toddler Educator leads the toddler care team by developing and managing
+     programmes focused on early childhood literacy and motor skills development. This
+     role emphasizes coordinating group activities and managing classroom logistics,
+     while maintaining compliance with childcare regulations specific to toddlers.
+     The Senior Toddler Educator also oversees staff scheduling and administrative
+     reporting, working closely with centre management to ensure operational efficiency.
+   - The Senior Infant Educator serves as a key mentor and leader within the Infant
+     Educator team, guiding and supporting staff in delivering high-quality infant
+     care. This role involves overseeing the creation and execution of age-appropriate
+     curricula and daily caregiving activities tailored to infants’ developmental needs.
+     Additionally, the Senior Infant Educator fosters strong collaborations with families
+     and community partners, designs family engagement initiatives, and promotes a
+     culture of ongoing learning and teamwork aligned with the Centre’s core values
+     and objectives.
+ - source_sentence: The Senior Manufacturing Planning Executive formulates production
+     plans and organises materials, manpower and resources to accomplish manufacturing
+     functions to fulfil customer and financial commitments. He/She validates the master
+     production schedule (MPS) and drives adherence of manufacturing works to project
+     schedules and goals in collaboration with cross-functional leads. He leads material
+     requirements planning and programme reviews with relevant stakeholders. He is
+     responsible for optimising supply chain and logistics planning, contract negotiations,
+     vendor sourcing, inventory planning and control and warehousing operations to
+     meet manufacturing requirements. He leverages data from supply chain management
+     (SCM) systems to enhance decision-making and implements supplier capability development
+     plans to enhance performance. He drives continuous improvements on product on-time
+     delivery and total available man-hours, develops strategies and priorities for
+     critical customer issues, facilitates problem-solving, leads in regular reviews
+     with customers and suppliers, and establishes best practices on process improvements
+     to enhance productivity. He proactively contributes to the development of lean
+     and sustainability practices, and conducts research and digital innovation in
+     targeted areas for continuous process improvements. As a team leader, he appraises
+     staff performance and conducts coaching and mentoring for planning personnel.
+     He should possess advanced statistical, forecasting and analytical skills to predict
+     planning and resource requirements. He is able to drive cross-functional collaboration
+     between internal and external stakeholders to optimise the planning processes
+     and ensure maximum resource utilisation.
+   sentences:
+   - The Senior Manufacturing Planning Executive develops and implements production
+     schedules while coordinating materials, workforce, and resources to meet manufacturing
+     targets aligned with customer demands and financial objectives. This role involves
+     validating the master production schedule and ensuring manufacturing activities
+     comply with project timelines through collaboration with various departments.
+     The executive leads material planning and program assessments with key partners
+     and is accountable for optimizing supply chain logistics, managing contracts,
+     sourcing vendors, controlling inventory, and overseeing warehouse operations to
+     support manufacturing needs. By utilizing supply chain management data, the executive
+     enhances decision-making and drives supplier capability improvements. They champion
+     continuous improvements in on-time delivery performance and labor efficiency,
+     formulate strategies to address critical customer concerns, facilitate problem
+     resolution, conduct stakeholder reviews, and promote best practices to boost productivity.
+     Additionally, the role supports lean methodologies and sustainability initiatives,
+     explores digital innovations, and leads process enhancements. As a leader, the
+     executive evaluates team performance and provides coaching and mentoring to planning
+     staff. The position demands strong statistical, forecasting, and analytical expertise
+     to anticipate planning and resource demands and fosters effective collaboration
+     among internal and external partners to maximize planning efficiencies.
+   - The Supervisor (Passenger Services) oversees daily passenger service operations
+     to ensure compliance with established service quality benchmarks. Collaborating
+     closely with multiple departments, this role addresses intricate customer concerns
+     and conducts routine safety and security inspections to uphold a secure workplace.
+     Serving as a mentor, the Supervisor guides team members and handles conflict resolution,
+     grievances, and disputes within the team. A comprehensive knowledge of airport
+     and airline check-in protocols, as well as baggage handling system procedures,
+     is essential. Operating in shifts to support continuous flight schedules, the
+     Supervisor acts as a representative for the company’s service standards. The role
+     demands strong communication, interpersonal, customer service, and leadership
+     abilities, with an aptitude for working effectively in a diverse, multicultural
+     environment.
+   - The Senior Procurement Executive manages the acquisition of goods and services,
+     negotiates supplier contracts, and oversees vendor relationships to support the
+     company’s purchasing needs. This role focuses on sourcing strategies, supplier
+     evaluations, cost analysis, and procurement compliance within the manufacturing
+     industry. The executive leads procurement planning, coordinates with finance and
+     operations teams, and ensures timely delivery of purchased materials. They are
+     responsible for maintaining supplier performance metrics, conducting market research,
+     and implementing procurement best practices. The role requires strong negotiation
+     skills, supplier risk management, and contract administration experience. As a
+     senior professional, the executive supervises procurement staff and drives continuous
+     improvement initiatives in sourcing processes.
+ datasets:
+ - dnth/ssf-train-valid-v4.2
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on nomic-ai/modernbert-embed-base
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the [ssf-train-valid-v4.2](https://huggingface.co/datasets/dnth/ssf-train-valid-v4.2) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - [ssf-train-valid-v4.2](https://huggingface.co/datasets/dnth/ssf-train-valid-v4.2)
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
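The architecture ends with a `Normalize()` module, and the card lists cosine similarity as the similarity function. One consequence, sketched below in plain Python, is that for unit-length vectors the dot product and the cosine similarity coincide. The helper names and the 2-dimensional vectors are hypothetical and chosen only for illustration; they are not taken from this commit or from the library.

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector]


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity computed from the unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]
a_unit = l2_normalize(a)
b_unit = l2_normalize(b)

# After normalization, the dot product alone gives the cosine similarity.
print(round(dot(a_unit, b_unit), 4))  # 0.96
print(round(cosine(a, b), 4))         # 0.96
```

In practice this means normalized embeddings can be compared with a single matrix multiplication, which is convenient for vector indexes that only support inner-product search.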
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("dnth/ssf-retriever-modernbert-embed-base-v4.2")
+ # Run inference
+ sentences = [
+ 'The Senior Manufacturing Planning Executive formulates production plans and organises materials, manpower and resources to accomplish manufacturing functions to fulfil customer and financial commitments. He/She validates the master production schedule (MPS) and drives adherence of manufacturing works to project schedules and goals in collaboration with cross-functional leads. He leads material requirements planning and programme reviews with relevant stakeholders. He is responsible for optimising supply chain and logistics planning, contract negotiations, vendor sourcing, inventory planning and control and warehousing operations to meet manufacturing requirements. He leverages data from supply chain management (SCM) systems to enhance decision-making and implements supplier capability development plans to enhance performance. He drives continuous improvements on product on-time delivery and total available man-hours, develops strategies and priorities for critical customer issues, facilitates problem-solving, leads in regular reviews with customers and suppliers, and establishes best practices on process improvements to enhance productivity. He proactively contributes to the development of lean and sustainability practices, and conducts research and digital innovation in targeted areas for continuous process improvements. As a team leader, he appraises staff performance and conducts coaching and mentoring for planning personnel. He should possess advanced statistical, forecasting and analytical skills to predict planning and resource requirements. He is able to drive cross-functional collaboration between internal and external stakeholders to optimise the planning processes and ensure maximum resource utilisation.',
+ 'The Senior Manufacturing Planning Executive develops and implements production schedules while coordinating materials, workforce, and resources to meet manufacturing targets aligned with customer demands and financial objectives. This role involves validating the master production schedule and ensuring manufacturing activities comply with project timelines through collaboration with various departments. The executive leads material planning and program assessments with key partners and is accountable for optimizing supply chain logistics, managing contracts, sourcing vendors, controlling inventory, and overseeing warehouse operations to support manufacturing needs. By utilizing supply chain management data, the executive enhances decision-making and drives supplier capability improvements. They champion continuous improvements in on-time delivery performance and labor efficiency, formulate strategies to address critical customer concerns, facilitate problem resolution, conduct stakeholder reviews, and promote best practices to boost productivity. Additionally, the role supports lean methodologies and sustainability initiatives, explores digital innovations, and leads process enhancements. As a leader, the executive evaluates team performance and provides coaching and mentoring to planning staff. The position demands strong statistical, forecasting, and analytical expertise to anticipate planning and resource demands and fosters effective collaboration among internal and external partners to maximize planning efficiencies.',
+ 'The Senior Procurement Executive manages the acquisition of goods and services, negotiates supplier contracts, and oversees vendor relationships to support the company’s purchasing needs. This role focuses on sourcing strategies, supplier evaluations, cost analysis, and procurement compliance within the manufacturing industry. The executive leads procurement planning, coordinates with finance and operations teams, and ensures timely delivery of purchased materials. They are responsible for maintaining supplier performance metrics, conducting market research, and implementing procurement best practices. The role requires strong negotiation skills, supplier risk management, and contract administration experience. As a senior professional, the executive supervises procurement staff and drives continuous improvement initiatives in sourcing processes.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ # tensor([[1.0000, 0.9240, 0.5197],
+ #         [0.9240, 1.0000, 0.5085],
+ #         [0.5197, 0.5085, 1.0000]])
+ ```
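The similarity matrix printed in the README example can be read as a small retrieval result: row 0 is the source job description, and the other columns are candidates. The sketch below copies the three-by-three scores from the README comment into a plain list and ranks the candidates for one row. The function `rank_candidates` is a hypothetical helper for illustration, not part of Sentence Transformers.

```python
from typing import List, Tuple

# Scores copied from the README's printed similarity matrix.
similarities = [
    [1.0000, 0.9240, 0.5197],
    [0.9240, 1.0000, 0.5085],
    [0.5197, 0.5085, 1.0000],
]


def rank_candidates(scores: List[List[float]], query_index: int) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for every other sentence, best match first."""
    candidates = [
        (index, score)
        for index, score in enumerate(scores[query_index])
        if index != query_index  # skip the query's similarity with itself
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(similarities, 0)
print(ranking)  # [(1, 0.924), (2, 0.5197)]
```

Here the paraphrased description (index 1) ranks above the procurement role (index 2), matching the positive and negative examples in the widget entry.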
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### ssf-train-valid-v4.2
+
+ * Dataset: [ssf-train-valid-v4.2](https://huggingface.co/datasets/dnth/ssf-train-valid-v4.2) at [97c8b4d](https://huggingface.co/datasets/dnth/ssf-train-valid-v4.2/tree/97c8b4d3dc96a480e369838fb9f00464ce9080e9)
+ * Size: 7,540 training samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | anchor | positive | negative |
+ |:--------|:-------|:---------|:---------|
+ | type | string | string | string |
+ | details | <ul><li>min: 58 tokens</li><li>mean: 167.85 tokens</li><li>max: 355 tokens</li></ul> | <ul><li>min: 58 tokens</li><li>mean: 138.3 tokens</li><li>max: 293 tokens</li></ul> | <ul><li>min: 50 tokens</li><li>mean: 108.71 tokens</li><li>max: 249 tokens</li></ul> |
+ * Samples:
+ | anchor | positive | negative |
+ |:-------|:---------|:---------|
+ | <code>The Manufacturing Engineer/Production Engineer (Assembly) develops detailed operation and specification sheets throughout the assembly cycle. He/She coordinates shop floor operations and process control, and plans resources to meet production targets. He is conversant with tools and fixtures design and computer integrated manufacturing (CIM) technologies. He determines appropriate resources and processes for engineering application while ensuring working conditions of assembly equipment and machinery. He also manages assembly techniques and verifies conformance of new aircraft components and parts to specifications. He ensures adherence of assembly operations to legislative and airworthiness requirements, as well as with the organisation's standard operating procedures (SOPs), safety, health and quality systems. He identifies opportunities for continuous improvement through data analytics, research and innovation, and implements lean and sustainability practices in assembly. He monitor...</code> | <code>The Manufacturing Engineer (Assembly) is responsible for creating detailed operation and specification documentation for the assembly process. This role involves coordinating shop floor activities and overseeing process controls while managing resource planning to achieve production goals. The engineer applies expertise in tooling and fixture design alongside computer integrated manufacturing (CIM) technologies to determine suitable resources and processes for engineering tasks. Ensuring optimal working conditions for assembly machinery and equipment, the engineer supervises assembly methods and confirms that new aircraft parts meet stringent specification requirements. Compliance with legislative, airworthiness, and organizational standard operating procedures (SOPs), as well as health, safety, and quality management systems, is rigorously maintained. The role focuses on identifying and implementing continuous improvement initiatives through data analysis, innovation, and lean manufac...</code> | <code>The Manufacturing Engineer (Quality Assurance) oversees the inspection and testing of finished products to ensure compliance with quality standards. This role manages quality control procedures throughout the production cycle but does not directly engage in assembly or shop floor coordination. The engineer utilises statistical process control and quality management systems to monitor product conformity, focusing on defect reduction and customer satisfaction. Responsibilities include conducting audits, documenting non-conformance issues, and recommending corrective actions aligned with regulatory requirements and internal policies. While familiar with manufacturing tools and technologies, this position emphasizes quality assurance processes rather than resource planning or tooling design. The engineer collaborates with cross-functional teams to improve product reliability and supports training initiatives on quality standards. Strong analytical skills and attention to detail are necessa...</code> |
381
+ | <code>The Linen Room Attendant/Laundry Valet Attendant performs daily assigned duties to support the day-to-day laundry, linen and uniform room operations, ensuring the delivery of clean garments, uniforms, towels and linens to all internal and external customers. He/She collects and delivers guest laundry, performs laundry cleaning, sorts and issues linens and uniforms, and assists in inventory count. He also cleans and maintains laundry equipment and the work area. As part of service delivery, the Linen Room Attendant/Laundry Valet Attendant has to handle guests' requests and respond to their concerns and feedback in a professional and courteous manner. He complies with organisational guidelines and regulations on hygiene and workplace safety and health, and reports safety hazards observed to ensure workplace safety and security. He is a team player with a high level of attentiveness to details and good communication skills to interact with guests and all levels of staff. He works on shift...</code> | <code>The Linen Room Attendant/Laundry Valet Attendant is responsible for supporting daily operations in the laundry, linen, and uniform rooms by ensuring prompt and efficient delivery of cleaned garments, towels, uniforms, and linens to both internal departments and external guests. This role involves collecting and returning guest laundry, sorting and distributing linens and uniforms, conducting inventory checks, and maintaining cleanliness of laundry equipment and workspaces. The attendant addresses guest inquiries and concerns professionally and courteously, adheres to hygiene and workplace safety standards, and promptly reports any safety issues. This position requires teamwork, attention to detail, effective communication skills, and physical stamina to handle tasks such as standing, walking, and lifting heavy laundry loads throughout shifts that may include weekends and public holidays.</code> | <code>The Linen Room Supervisor oversees the strategic planning and management of laundry services within a hotel, leading a team of attendants and coordinating with multiple departments to optimize operational efficiency. This senior role involves budgeting, staff training, and implementing quality control measures rather than performing hands-on laundry tasks. The supervisor is responsible for developing service standards, managing vendor relationships, and ensuring compliance with corporate policies, with minimal direct involvement in daily linen sorting or equipment maintenance. Strong leadership, decision-making capabilities, and experience in workforce management are essential, while physical demands are limited compared to frontline laundry roles.</code> |
382
+ | <code>The General Worker / Operator performs general duties, and cleaning and housekeeping tasks as assigned. He/She is required to assist in operating machinery under supervision and moving aircraft components, equipment and materials from the store to respective work areas. He is expected to adhere to the organisation's standard operating procedures (SOPs), and safety, health and quality systems. He supports in implementation of continuous improvement initiatives to ensure workspace efficiency and effectiveness. He works in a hangar or workshop and may be required to work in shifts. He should be comfortable with repetitive work activities and exposure to physically demanding work conditions such as long standing hours and extreme temperatures.</code> | <code>The General Worker / Operator is responsible for carrying out various general tasks including cleaning and housekeeping duties as directed. This role involves assisting with machinery operation under guidance and transporting aircraft parts, equipment, and supplies from storage to designated work areas. The incumbent must strictly follow the company’s standard operating procedures, along with safety, health, and quality protocols. They contribute to continuous improvement efforts aimed at enhancing workspace productivity and efficiency. The position is based in a hangar or workshop environment and may require shift work. The ideal candidate should be able to handle repetitive tasks and endure physically challenging conditions such as prolonged standing and exposure to temperature extremes.</code> | <code>The Warehouse Clerk manages inventory records and coordinates the receipt and dispatch of goods within the logistics sector. This role requires proficiency in inventory management software and strong organizational skills to maintain stock accuracy. The Warehouse Clerk operates in a distribution center and collaborates closely with supply chain teams to ensure timely delivery schedules. The position demands attention to detail and the ability to work under pressure but does not involve machinery operation or physically strenuous activities common in manufacturing environments.</code> |
383
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
384
+ ```json
385
+ {
386
+ "scale": 20.0,
387
+ "similarity_fct": "cos_sim",
388
+ "gather_across_devices": false
389
+ }
390
+ ```
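To make the loss parameters above concrete, here is a minimal pure-Python sketch of how a multiple-negatives ranking loss with `scale = 20.0` and cosine similarity can be computed. It is an illustration under stated assumptions, not the library's implementation: the helper names `cos_sim` and `mnr_loss` and the toy 2-dimensional vectors are hypothetical, and the candidate pool is assumed to be all positives in the batch followed by all hard negatives, with `candidates[i]` being the correct match for `anchors[i]`.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, candidates, scale=20.0):
    """Mean cross-entropy over anchors; candidates[i] is the target for anchors[i].

    Every other candidate (other positives, all hard negatives) acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


# Toy batch (hypothetical values): two anchors, their positives, two hard negatives.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
hard_negatives = [[0.5, 0.5], [0.6, 0.4]]

loss = mnr_loss(anchors, positives + hard_negatives)
print(f"loss = {loss:.4f}")
```

A larger `scale` sharpens the softmax, so small gaps in cosine similarity translate into larger differences in loss.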
391
+
392
+ ### Evaluation Dataset
393
+
394
+ #### ssf-train-valid-v4.2
395
+
396
+ * Dataset: [ssf-train-valid-v4.2](https://huggingface.co/datasets/dnth/ssf-train-valid-v4.2) at [97c8b4d](https://huggingface.co/datasets/dnth/ssf-train-valid-v4.2/tree/97c8b4d3dc96a480e369838fb9f00464ce9080e9)
397
+ * Size: 1,885 evaluation samples
398
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
399
+ * Approximate statistics based on the first 1000 samples:
400
+ | | anchor | positive | negative |
401
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
402
+ | type | string | string | string |
403
+ | details | <ul><li>min: 60 tokens</li><li>mean: 170.26 tokens</li><li>max: 403 tokens</li></ul> | <ul><li>min: 59 tokens</li><li>mean: 138.72 tokens</li><li>max: 265 tokens</li></ul> | <ul><li>min: 50 tokens</li><li>mean: 109.98 tokens</li><li>max: 252 tokens</li></ul> |
404
+ * Samples:
405
+ | anchor | positive | negative |
406
+ |:---|:---|:---|
407
+ | <code>The Technician (Signal and Communications) works in a team to perform preventive and corrective maintenance of signal, communication and control systems, to improve the reliability of signal, communication and control systems. He/She assists in the preparation of maintenance activities and is technically inclined and adept in handling electronics and computer-based systems and equipment for maintenance. He also supervises the work of contractors and external stakeholders in ensuring adherence to operating requirements and safety standards. He may be required to perform shift duties at various rail premises such as workshops, depots, train stations, and train tunnels. He is capable of communicating effectively within the team, is able to multi-task and can prioritises his assigned maintenance workload in supporting maintenance activities.</code> | <code>The Technician (Signal and Communications) collaborates within a team to conduct routine and emergency maintenance on signal, communication, and control infrastructures, aiming to enhance system reliability. This role involves assisting in the planning of maintenance operations and requires strong technical skills in electronics and computer-based maintenance tools. The technician oversees contractors and external partners to ensure compliance with operational protocols and safety guidelines. Shift work at various rail facilities, including workshops, depots, stations, and tunnels, may be necessary. Effective team communication, multitasking abilities, and prioritization of maintenance tasks are essential to support ongoing maintenance efforts.</code> | <code>The Technician (Electrical Installations) is responsible for installing and testing electrical wiring and equipment in residential and commercial buildings. They prepare site layouts, follow electrical codes, and ensure safety during installation processes. The technician coordinates with suppliers and clients but does not engage in signal or communication system maintenance. Shift work is generally not required, and the role focuses on hands-on installation rather than supervising external contractors. Strong knowledge of electrical wiring, circuit breakers, and household electrical standards is necessary, along with good communication skills to liaise with homeowners and site managers.</code> |
408
+ | <code>The Visual Merchandiser manages shopper marketing activities and is responsible for the conceptualisation of the visual merchandising plans. He/she oversees the set-up of merchandise display by coaching in-store teams. He is also responsible for market research efforts relating to visual merchandising. He operates in a fast-paced and creative environment where he conceptualises eye-catching product displays, store layouts and designs to promote the store's products. He is creative, detail-oriented and is effective working within tight deadlines. He is able to effectively prioritise multiple assignments and possesses an aesthetic flair.</code> | <code>The Visual Merchandiser is responsible for planning and executing shopper marketing strategies through innovative visual displays. This role involves guiding retail teams in arranging merchandise presentations and ensuring the store environment is appealing and aligned with brand standards. The Visual Merchandiser conducts market research to stay updated on trends and consumer preferences, working in a dynamic, fast-paced setting that demands creativity and precision. Strong organizational skills and an eye for design are essential to manage multiple projects and deliver compelling store layouts that enhance customer engagement.</code> | <code>The Visual Merchandiser leads the digital marketing campaigns for retail brands, focusing on online shopper engagement and social media promotions. He/she develops content strategies, coordinates with creative teams, and analyses ecommerce data to optimise product visibility. Operating in a technology-driven environment, the Visual Merchandiser applies analytical skills and marketing knowledge to influence buying behaviour through digital channels rather than physical displays. This role requires proficiency in digital tools and a strong understanding of consumer analytics rather than traditional visual merchandising techniques.</code> |
409
+ | <code>The Network Development Technician implements gas transmission and/or distribution network development projects and monitors site activities. He/She supports the preparation of construction activity records, project progress reports and materials required for payments. He also liaises with contractors and customers to carry out metering works and performs the installation, testing and commissioning of residential meters. He applies Safe System of Work (SSoW) procedures and risk control measures to ensure work activities are carried out safely, and in compliance with Workplace Safety and Health (WSH) Act. He is a member of the Emergency Response Team and follows emergency response plans and relevant safety procedures. He occasionally works at construction sites for the gas transmission and/or distribution network development projects. He is a good team player who collaborates and communicates effectively with key stakeholders. He is detailed in ensuring that operations are carried out a...</code> | <code>The Network Development Technician is responsible for executing gas transmission and distribution network expansion initiatives while overseeing on-site operations. This role involves assisting in the documentation of construction activities, compiling project status updates, and coordinating materials for billing purposes. The technician interacts with contractors and clients to facilitate metering installations, including the testing and commissioning of residential gas meters. Adherence to Safe System of Work protocols and risk mitigation strategies is essential to maintain compliance with the Workplace Safety and Health Act. As an integral member of the Emergency Response Team, the technician follows prescribed emergency procedures and safety guidelines. Fieldwork at construction locations is periodically required. Strong teamwork, clear communication with stakeholders, and meticulous attention to procedural compliance are key attributes for success in this role.</code> | <code>The Network Operations Coordinator oversees the scheduling and administration of telecommunications network services, ensuring seamless connectivity and customer satisfaction. This position requires coordinating with service providers and vendors to manage infrastructure upgrades and maintenance tasks. The coordinator prepares operational reports and assists with billing reconciliations. Familiarity with IT systems and network management software is essential, alongside strong communication skills to liaise with internal teams and external partners. While safety protocols are observed, the role primarily focuses on service delivery rather than physical installation or emergency response activities. The coordinator works mainly in an office environment and supports multiple projects simultaneously without direct involvement in gas transmission or distribution networks.</code> |
410
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
411
+ ```json
412
+ {
413
+ "scale": 20.0,
414
+ "similarity_fct": "cos_sim",
415
+ "gather_across_devices": false
416
+ }
417
+ ```
418
+
419
+ ### Training Hyperparameters
420
+ #### Non-Default Hyperparameters
421
+
422
+ - `eval_strategy`: epoch
423
+ - `per_device_train_batch_size`: 16
424
+ - `per_device_eval_batch_size`: 32
425
+ - `gradient_accumulation_steps`: 32
426
+ - `learning_rate`: 2e-05
427
+ - `weight_decay`: 0.01
428
+ - `num_train_epochs`: 5
429
+ - `lr_scheduler_type`: cosine
430
+ - `warmup_ratio`: 0.1
431
+ - `bf16`: True
432
+ - `tf32`: True
433
+ - `load_best_model_at_end`: True
434
+ - `gradient_checkpointing`: True
435
+ - `batch_sampler`: no_duplicates
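The non-default hyperparameters above determine the effective batch size and the number of optimizer steps. The sketch below works through that arithmetic. The training-set size of 7,540 comes from the `dataset_size:7540` tag in the README front matter; the single-device setup and the rounding rules (`ceil` for batches, optimizer steps, and warmup steps) are assumptions, and the `no_duplicates` sampler could in principle shift the batch count slightly.

```python
import math

train_samples = 7540        # from the `dataset_size:7540` tag
per_device_batch = 16       # per_device_train_batch_size
grad_accum = 32             # gradient_accumulation_steps
epochs = 5                  # num_train_epochs
warmup_ratio = 0.1          # warmup_ratio
num_devices = 1             # assumption: single-GPU training

# Mini-batches the dataloader yields per epoch (last partial batch kept).
batches_per_epoch = math.ceil(train_samples / (per_device_batch * num_devices))

# Samples contributing to each optimizer update.
effective_batch = per_device_batch * grad_accum * num_devices

# Optimizer updates per epoch and over the whole run.
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
total_steps = steps_per_epoch * epochs

# Linear warmup length before the cosine decay starts.
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Fractional epoch reached after 5 optimizer steps.
epoch_at_step_5 = round(5 * grad_accum / batches_per_epoch, 4)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps, epoch_at_step_5)
```

Under these assumptions the run has 75 optimizer steps in total, and step 5 falls at epoch 0.339, which is consistent with the step and epoch columns reported in the Training Logs table.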
436
+
437
+ #### All Hyperparameters
438
+ <details><summary>Click to expand</summary>
439
+
440
+ - `overwrite_output_dir`: False
441
+ - `do_predict`: False
442
+ - `eval_strategy`: epoch
443
+ - `prediction_loss_only`: True
444
+ - `per_device_train_batch_size`: 16
445
+ - `per_device_eval_batch_size`: 32
446
+ - `per_gpu_train_batch_size`: None
447
+ - `per_gpu_eval_batch_size`: None
448
+ - `gradient_accumulation_steps`: 32
449
+ - `eval_accumulation_steps`: None
450
+ - `torch_empty_cache_steps`: None
451
+ - `learning_rate`: 2e-05
452
+ - `weight_decay`: 0.01
453
+ - `adam_beta1`: 0.9
454
+ - `adam_beta2`: 0.999
455
+ - `adam_epsilon`: 1e-08
456
+ - `max_grad_norm`: 1.0
457
+ - `num_train_epochs`: 5
458
+ - `max_steps`: -1
459
+ - `lr_scheduler_type`: cosine
460
+ - `lr_scheduler_kwargs`: {}
461
+ - `warmup_ratio`: 0.1
462
+ - `warmup_steps`: 0
463
+ - `log_level`: passive
464
+ - `log_level_replica`: warning
465
+ - `log_on_each_node`: True
466
+ - `logging_nan_inf_filter`: True
467
+ - `save_safetensors`: True
468
+ - `save_on_each_node`: False
469
+ - `save_only_model`: False
470
+ - `restore_callback_states_from_checkpoint`: False
471
+ - `no_cuda`: False
472
+ - `use_cpu`: False
473
+ - `use_mps_device`: False
474
+ - `seed`: 42
475
+ - `data_seed`: None
476
+ - `jit_mode_eval`: False
477
+ - `use_ipex`: False
478
+ - `bf16`: True
479
+ - `fp16`: False
480
+ - `fp16_opt_level`: O1
481
+ - `half_precision_backend`: auto
482
+ - `bf16_full_eval`: False
483
+ - `fp16_full_eval`: False
484
+ - `tf32`: True
485
+ - `local_rank`: 0
486
+ - `ddp_backend`: None
487
+ - `tpu_num_cores`: None
488
+ - `tpu_metrics_debug`: False
489
+ - `debug`: []
490
+ - `dataloader_drop_last`: False
491
+ - `dataloader_num_workers`: 0
492
+ - `dataloader_prefetch_factor`: None
493
+ - `past_index`: -1
494
+ - `disable_tqdm`: False
495
+ - `remove_unused_columns`: True
496
+ - `label_names`: None
497
+ - `load_best_model_at_end`: True
498
+ - `ignore_data_skip`: False
499
+ - `fsdp`: []
500
+ - `fsdp_min_num_params`: 0
501
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
502
+ - `fsdp_transformer_layer_cls_to_wrap`: None
503
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
504
+ - `deepspeed`: None
505
+ - `label_smoothing_factor`: 0.0
506
+ - `optim`: adamw_torch_fused
507
+ - `optim_args`: None
508
+ - `adafactor`: False
509
+ - `group_by_length`: False
510
+ - `length_column_name`: length
511
+ - `ddp_find_unused_parameters`: None
512
+ - `ddp_bucket_cap_mb`: None
513
+ - `ddp_broadcast_buffers`: False
514
+ - `dataloader_pin_memory`: True
515
+ - `dataloader_persistent_workers`: False
516
+ - `skip_memory_metrics`: True
517
+ - `use_legacy_prediction_loop`: False
518
+ - `push_to_hub`: False
519
+ - `resume_from_checkpoint`: None
520
+ - `hub_model_id`: None
521
+ - `hub_strategy`: every_save
522
+ - `hub_private_repo`: None
523
+ - `hub_always_push`: False
524
+ - `hub_revision`: None
525
+ - `gradient_checkpointing`: True
526
+ - `gradient_checkpointing_kwargs`: None
527
+ - `include_inputs_for_metrics`: False
528
+ - `include_for_metrics`: []
529
+ - `eval_do_concat_batches`: True
530
+ - `fp16_backend`: auto
531
+ - `push_to_hub_model_id`: None
532
+ - `push_to_hub_organization`: None
533
+ - `mp_parameters`:
534
+ - `auto_find_batch_size`: False
535
+ - `full_determinism`: False
536
+ - `torchdynamo`: None
537
+ - `ray_scope`: last
538
+ - `ddp_timeout`: 1800
539
+ - `torch_compile`: False
540
+ - `torch_compile_backend`: None
541
+ - `torch_compile_mode`: None
542
+ - `include_tokens_per_second`: False
543
+ - `include_num_input_tokens_seen`: False
544
+ - `neftune_noise_alpha`: None
545
+ - `optim_target_modules`: None
546
+ - `batch_eval_metrics`: False
547
+ - `eval_on_start`: False
548
+ - `use_liger_kernel`: False
549
+ - `liger_kernel_config`: None
550
+ - `eval_use_gather_object`: False
551
+ - `average_tokens_across_devices`: False
552
+ - `prompts`: None
553
+ - `batch_sampler`: no_duplicates
554
+ - `multi_dataset_batch_sampler`: proportional
555
+ - `router_mapping`: {}
556
+ - `learning_rate_mapping`: {}
557
+
558
+ </details>
559
+
560
+ ### Training Logs
561
+ | Epoch | Step | Training Loss | Validation Loss |
562
+ |:-------:|:------:|:-------------:|:---------------:|
563
+ | 0.3390 | 5 | 0.163 | - |
564
+ | 0.6780 | 10 | 0.0257 | - |
565
+ | 1.0 | 15 | 0.0048 | 0.0057 |
566
+ | 1.3390 | 20 | 0.0031 | - |
567
+ | 1.6780 | 25 | 0.0021 | - |
568
+ | 2.0 | 30 | 0.0015 | 0.0027 |
569
+ | 2.3390 | 35 | 0.0021 | - |
570
+ | 2.6780 | 40 | 0.0023 | - |
571
+ | 3.0 | 45 | 0.001 | 0.0017 |
572
+ | 3.3390 | 50 | 0.0013 | - |
573
+ | 3.6780 | 55 | 0.0014 | - |
574
+ | 4.0 | 60 | 0.0013 | 0.0015 |
575
+ | 4.3390 | 65 | 0.0013 | - |
576
+ | 4.6780 | 70 | 0.001 | - |
577
+ | **5.0** | **75** | **0.0018** | **0.0015** |
578
+
579
+ * The bold row denotes the saved checkpoint.
580
+
581
+ ### Framework Versions
582
+ - Python: 3.12.8
583
+ - Sentence Transformers: 5.1.0
584
+ - Transformers: 4.55.0
585
+ - PyTorch: 2.8.0+cu128
586
+ - Accelerate: 1.10.0
587
+ - Datasets: 4.0.0
588
+ - Tokenizers: 0.21.4
589
+
590
+ ## Citation
591
+
592
+ ### BibTeX
593
+
594
+ #### Sentence Transformers
595
+ ```bibtex
596
+ @inproceedings{reimers-2019-sentence-bert,
597
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
598
+ author = "Reimers, Nils and Gurevych, Iryna",
599
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
600
+ month = "11",
601
+ year = "2019",
602
+ publisher = "Association for Computational Linguistics",
603
+ url = "https://arxiv.org/abs/1908.10084",
604
+ }
605
+ ```
606
+
607
+ #### MultipleNegativesRankingLoss
608
+ ```bibtex
609
+ @misc{henderson2017efficient,
610
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
611
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
612
+ year={2017},
613
+ eprint={1705.00652},
614
+ archivePrefix={arXiv},
615
+ primaryClass={cs.CL}
616
+ }
617
+ ```
618
+
619
+ <!--
620
+ ## Glossary
621
+
622
+ *Clearly define terms in order to be accessible across audiences.*
623
+ -->
624
+
625
+ <!--
626
+ ## Model Card Authors
627
+
628
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
629
+ -->
630
+
631
+ <!--
632
+ ## Model Card Contact
633
+
634
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
635
+ -->
config.json ADDED
@@ -0,0 +1,45 @@
1
+ {
2
+ "architectures": [
3
+ "ModernBertModel"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 50281,
8
+ "classifier_activation": "gelu",
9
+ "classifier_bias": false,
10
+ "classifier_dropout": 0.0,
11
+ "classifier_pooling": "mean",
12
+ "cls_token_id": 50281,
13
+ "decoder_bias": true,
14
+ "deterministic_flash_attn": false,
15
+ "embedding_dropout": 0.0,
16
+ "eos_token_id": 50282,
17
+ "global_attn_every_n_layers": 3,
18
+ "global_rope_theta": 160000.0,
19
+ "gradient_checkpointing": false,
20
+ "hidden_activation": "gelu",
21
+ "hidden_size": 768,
22
+ "initializer_cutoff_factor": 2.0,
23
+ "initializer_range": 0.02,
24
+ "intermediate_size": 1152,
25
+ "layer_norm_eps": 1e-05,
26
+ "local_attention": 128,
27
+ "local_rope_theta": 10000.0,
28
+ "max_position_embeddings": 8192,
29
+ "mlp_bias": false,
30
+ "mlp_dropout": 0.0,
31
+ "model_type": "modernbert",
32
+ "norm_bias": false,
33
+ "norm_eps": 1e-05,
34
+ "num_attention_heads": 12,
35
+ "num_hidden_layers": 22,
36
+ "pad_token_id": 50283,
37
+ "position_embedding_type": "absolute",
38
+ "repad_logits_with_grad": false,
39
+ "sep_token_id": 50282,
40
+ "sparse_pred_ignore_index": -100,
41
+ "sparse_prediction": false,
42
+ "torch_dtype": "float32",
43
+ "transformers_version": "4.55.0",
44
+ "vocab_size": 50368
45
+ }
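A few values in this config interact, so a short sketch may help in reading it. It derives the per-head dimension and which of the 22 layers would use global rather than local attention. The rule that layer `i` is global when `i % global_attn_every_n_layers == 0`, and that the `local_attention` window of 128 is split evenly on both sides of a token, is an assumption about how the ModernBERT implementation interprets these fields; check the installed `transformers` version before relying on it.

```python
# Values copied from config.json above.
num_hidden_layers = 22
global_attn_every_n_layers = 3
hidden_size = 768
num_attention_heads = 12
local_attention = 128

# Assumption: layer i attends globally when i is a multiple of n, locally otherwise.
global_layers = [
    i for i in range(num_hidden_layers) if i % global_attn_every_n_layers == 0
]
local_layers = [
    i for i in range(num_hidden_layers) if i % global_attn_every_n_layers != 0
]

# Each attention head works on an equal slice of the hidden state.
head_dim = hidden_size // num_attention_heads

# Assumption: the local window is split evenly to the left and right of a token.
window_per_side = local_attention // 2

print("global layers:", global_layers)
print("local layer count:", len(local_layers))
print("head_dim:", head_dim, "window per side:", window_per_side)
```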
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "5.1.0",
4
+ "transformers": "4.55.0",
5
+ "pytorch": "2.8.0+cu128"
6
+ },
7
+ "prompts": {
8
+ "query": "",
9
+ "document": ""
10
+ },
11
+ "default_prompt_name": null,
12
+ "similarity_fn_name": "cosine",
13
+ "model_type": "SentenceTransformer"
14
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:89ad01eccede8d8b76c86760bce1f2ce69a6181afefa59a8bd18d3c6e9f0cec0
3
+ size 596070136
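As a rough sanity check on the pointer above, the file size can be turned into an upper bound on the parameter count. This sketch assumes the weights are stored as 4-byte `float32` values (matching `"torch_dtype": "float32"` in `config.json`) and ignores the small safetensors header, so the true count is slightly lower.

```python
file_size_bytes = 596_070_136   # `size` field of the Git LFS pointer
bytes_per_float32 = 4           # assumption: all tensors stored as float32

# Upper bound: treats every byte as weight data, ignoring the file header.
max_params = file_size_bytes // bytes_per_float32
approx_millions = round(max_params / 1e6)

print(f"at most {max_params:,} parameters (~{approx_millions}M)")
```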
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
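The three modules listed above run in sequence: the Transformer produces one vector per token, Pooling collapses them into a single vector, and Normalize rescales it to unit length. The sketch below illustrates the last two stages in pure Python, assuming mean pooling over non-padding tokens (as `pooling_mode_mean_tokens: true` in `1_Pooling/config.json` suggests). The function names and the toy 2-dimensional token vectors are hypothetical; the real model uses 768 dimensions.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    dim = len(token_embeddings[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def embed(token_embeddings, attention_mask):
    """Pooling followed by normalization, mirroring modules 1 and 2."""
    return l2_normalize(mean_pool(token_embeddings, attention_mask))


# Toy output of the Transformer module: 3 tokens, the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

sentence_vec = embed(tokens, mask)
print(sentence_vec)
```

Because the final vectors have unit length, their dot product equals their cosine similarity, which fits the `"similarity_fn_name": "cosine"` setting in `config_sentence_transformers.json`.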
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 8192,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "|||IP_ADDRESS|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "1": {
+ "content": "<|padding|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50254": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50255": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50256": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50257": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50258": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50259": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50260": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50261": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50262": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50263": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50264": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50265": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50266": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50267": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50268": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50269": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50270": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50271": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50272": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50273": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50274": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50275": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50276": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50277": {
+ "content": "|||EMAIL_ADDRESS|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50278": {
+ "content": "|||PHONE_NUMBER|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50279": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50280": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50281": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50282": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50283": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50284": {
+ "content": "[MASK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50285": {
+ "content": "[unused0]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50286": {
+ "content": "[unused1]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50287": {
+ "content": "[unused2]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50288": {
+ "content": "[unused3]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50289": {
+ "content": "[unused4]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50290": {
+ "content": "[unused5]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50291": {
+ "content": "[unused6]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50292": {
+ "content": "[unused7]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50293": {
+ "content": "[unused8]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50294": {
+ "content": "[unused9]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50295": {
+ "content": "[unused10]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50296": {
+ "content": "[unused11]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50297": {
+ "content": "[unused12]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50298": {
+ "content": "[unused13]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50299": {
+ "content": "[unused14]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50300": {
+ "content": "[unused15]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50301": {
+ "content": "[unused16]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50302": {
+ "content": "[unused17]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50303": {
+ "content": "[unused18]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50304": {
+ "content": "[unused19]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50305": {
+ "content": "[unused20]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50306": {
+ "content": "[unused21]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50307": {
+ "content": "[unused22]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50308": {
+ "content": "[unused23]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50309": {
+ "content": "[unused24]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50310": {
+ "content": "[unused25]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50311": {
+ "content": "[unused26]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50312": {
+ "content": "[unused27]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50313": {
+ "content": "[unused28]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50314": {
+ "content": "[unused29]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50315": {
+ "content": "[unused30]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50316": {
+ "content": "[unused31]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50317": {
+ "content": "[unused32]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50318": {
+ "content": "[unused33]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50319": {
+ "content": "[unused34]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50320": {
+ "content": "[unused35]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50321": {
+ "content": "[unused36]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50322": {
+ "content": "[unused37]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50323": {
+ "content": "[unused38]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50324": {
+ "content": "[unused39]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50325": {
+ "content": "[unused40]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50326": {
+ "content": "[unused41]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50327": {
+ "content": "[unused42]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50328": {
+ "content": "[unused43]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50329": {
+ "content": "[unused44]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50330": {
+ "content": "[unused45]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50331": {
+ "content": "[unused46]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50332": {
+ "content": "[unused47]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50333": {
+ "content": "[unused48]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50334": {
+ "content": "[unused49]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50335": {
+ "content": "[unused50]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50336": {
+ "content": "[unused51]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50337": {
+ "content": "[unused52]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50338": {
+ "content": "[unused53]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50339": {
+ "content": "[unused54]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50340": {
+ "content": "[unused55]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50341": {
+ "content": "[unused56]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50342": {
+ "content": "[unused57]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50343": {
+ "content": "[unused58]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50344": {
+ "content": "[unused59]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50345": {
+ "content": "[unused60]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50346": {
+ "content": "[unused61]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50347": {
+ "content": "[unused62]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50348": {
+ "content": "[unused63]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50349": {
+ "content": "[unused64]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50350": {
+ "content": "[unused65]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50351": {
+ "content": "[unused66]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50352": {
+ "content": "[unused67]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50353": {
+ "content": "[unused68]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50354": {
+ "content": "[unused69]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50355": {
+ "content": "[unused70]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50356": {
+ "content": "[unused71]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50357": {
+ "content": "[unused72]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50358": {
+ "content": "[unused73]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50359": {
+ "content": "[unused74]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50360": {
+ "content": "[unused75]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50361": {
+ "content": "[unused76]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50362": {
+ "content": "[unused77]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50363": {
+ "content": "[unused78]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50364": {
+ "content": "[unused79]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50365": {
+ "content": "[unused80]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50366": {
+ "content": "[unused81]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50367": {
+ "content": "[unused82]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_input_names": [
+ "input_ids",
+ "attention_mask"
+ ],
+ "model_max_length": 8192,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "unk_token": "[UNK]"
+ }
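In `tokenizer_config.json`, the `added_tokens_decoder` object maps token ids (as string keys) to token definitions, and the top-level `cls_token`, `sep_token`, and similar fields refer to those tokens by content. A minimal sketch of reading that structure follows; it uses a hand-picked excerpt of four entries from the file above rather than the full file, and the variable names are hypothetical.

```python
import json

# A small excerpt of the config above; the real file has many more entries
# and more fields per entry.
EXCERPT = """
{
  "added_tokens_decoder": {
    "50279": {"content": "<|endoftext|>", "special": true},
    "50281": {"content": "[CLS]", "special": true},
    "50282": {"content": "[SEP]", "special": true},
    "50285": {"content": "[unused0]", "special": false}
  },
  "cls_token": "[CLS]",
  "sep_token": "[SEP]",
  "model_max_length": 8192
}
"""

config = json.loads(EXCERPT)

# Map each special token's text to its integer id.
special_ids = {
    entry["content"]: int(token_id)
    for token_id, entry in config["added_tokens_decoder"].items()
    if entry["special"]
}

print(special_ids[config["cls_token"]], special_ids[config["sep_token"]])
```

Entries such as `[unused0]` are registered as added tokens but carry `"special": false`, so the filter above leaves them out.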