FAQs: Voice transcription
How do I choose appropriate strictness values when entering a topic phrase?
To determine the optimal strictness for a phrase, start with a default setting, then evaluate the captured events and adjust the strictness accordingly. Experimentation is key to finding the right balance between precision and recall.
Longer phrases with more meaningful words often require less strict matching. For example, a three-word phrase might match effectively with medium strictness (two out of three words), while a five-word phrase might only need three matches (medium-low strictness). The ideal strictness depends on the specific phrase and its intended use.
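As an illustration of how strictness maps to a minimum word-match count, the following Python sketch derives the threshold from the phrase length. This is not the actual Genesys Cloud matching algorithm; the strictness-to-fraction mapping here is an assumption chosen to reproduce the examples above.

```python
import math

# Hypothetical mapping from strictness level to the fraction of a
# phrase's words that must be heard for the phrase to match.
# These fractions are illustrative assumptions, not Genesys Cloud's values.
STRICTNESS_FRACTIONS = {
    "low": 0.40,
    "medium-low": 0.55,
    "medium": 0.65,
    "high": 0.85,
}

def required_matches(phrase: str, strictness: str) -> int:
    """Return how many words of `phrase` must match at this strictness."""
    word_count = len(phrase.split())
    # Round up so a 3-word phrase at "medium" (0.65) needs 2 words,
    # and a 5-word phrase at "medium-low" (0.55) needs 3 words.
    return math.ceil(word_count * STRICTNESS_FRACTIONS[strictness])

print(required_matches("cancel my subscription", "medium"))          # 2 of 3
print(required_matches("I want to cancel everything", "medium-low")) # 3 of 5
```

Lowering the strictness lowers the threshold, trading precision for recall, which is why longer phrases can tolerate less strict matching.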
Below are some examples where adjusting strictness can either help or hinder your results, highlighting the importance of tweaking strictness for the best precision/recall tradeoff.
For more information, see Work with a topic, and Work with a phrase.
Voice transcription – What is dictionary management?
Dictionary management provides a means of improving recognition for business or domain-specific terms. Specific brands, words, or acronyms are transcribed based on the organization’s specifics. This feature allows customers to add terms to the dictionary, enhancing the transcription service’s likelihood of recognition. For more information, see Understand dictionary management.
One way of identifying similar-sounding terms involves observing recognition errors in the transcript. For example, if you consistently notice that “IRS” is transcribed as “eye are es,” you can now add the term “IRS” and include a similar-sounding entry with “eye are es.” See the following table for more examples.
Term | Example phrases | Sounds like |
---|---|---|
Neurological | He has a neurological condition | Neuro logical; Euro logical |
Healthcare | Qualified healthcare provider | Health care |
Priming | Priming the pump; Priming the charge; Priming the brain | Prime ing |
IRS | An IRS audit; Following IRS direction; Reviewed by the IRS; Requested by the IRS | Eye are ess |
Acme | My favorite brand is Acme; Acme brand pancakes; Strong loyalty from acme; I like acme | Acne |
Louis Vuitton | My favorite brand is Louis Vuitton; I like Louis Vuitton hand bags; I’d like to order a Louis Vuitton suitcase | Lui vito |
- Dictionary management is not case sensitive. It will not modify the capitalization of terms in the transcript.
- In languages that typically lack spaces, such as Japanese, it is important to include spaces in terms to enhance performance.
Dictionary management does not interfere with topic spotting, so users who want to spot topics in interactions can continue to use that service. Topic spotting currently supports native voice transcription dialects. For more information, see Genesys Cloud supported languages.
Voice transcription – How much does Extended Voice Transcription Services cost?
Extended Voice Transcription Services (EVTS) is billed per minute of usage. For every minute of voice transcription through EVTS, your organization is billed at the rate for its billing currency.
- Under Voice transcription (legacy) (GC-170-NV-VTFAIRUSEO), EVTS has no fair use allocation. Billing occurs as soon as EVTS is used.
- Under Voice transcription (GC-170-NV-VOICETRANSCRIPTION), EVTS and native transcription have a fair use allocation.
- EVTS is available for Genesys Cloud CX1 and Genesys Cloud CX2 organizations as long as the Genesys Cloud CX1 WEM Add-on II or Genesys Cloud CX2 WEM Add-on I is enabled. When using EVTS, transcribed users are not billed for Genesys Cloud CX1 WEM Add-on II or Genesys Cloud CX2 WEM Add-on I, provided that topic spotting is not enabled for those interactions.
EVTS cost per minute, by billing currency:
USD | CAD | AUD | NZD | GBP | EUR | BRL | JPY | ZAR |
---|---|---|---|---|---|---|---|---|
0.0100 | 0.0110 | 0.0130 | 0.0140 | 0.0070 | 0.0080 | 0.0400 | 1.2000 | 0.1420 |
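As a quick sketch of how per-minute billing works, an estimated charge is simply transcribed minutes multiplied by the rate from the table above. This example ignores any fair use allocation, so a real invoice may differ:

```python
# Per-minute EVTS rates taken from the table above (subset).
# Fair use allocations, where applicable, are ignored in this sketch.
RATE_PER_MINUTE = {"USD": 0.0100, "GBP": 0.0070, "JPY": 1.2000}

def estimated_bill(minutes: float, currency: str) -> float:
    """Estimate the EVTS charge for a number of transcribed minutes."""
    return round(minutes * RATE_PER_MINUTE[currency], 2)

print(estimated_bill(10_000, "USD"))  # 100.0
print(estimated_bill(1_000, "JPY"))   # 1200.0
```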
For more information, see Genesys Cloud fair use policy and Genesys Cloud pricing and concurrency update.
Voice transcription – How does Extended Voice Transcription Services – Azure provide customer data security?
Extended Voice Transcription Services streams media outside of Genesys Cloud to a third party to generate voice transcripts. Currently, these Extended Voice Transcription Services are provided by Microsoft through their Azure Speech-to-Text offering. As part of this combined offering, Genesys ensures data security in the following ways:
Note: Genesys Cloud is transitioning the Extended Voice Transcription Services engine from Microsoft Azure to AWS Transcribe. Impacted organizations will receive advance notice prior to any changes.
- Azure Speech-to-Text does not store any audio or transcription data at rest. All data in transit is encrypted. For more information, see Microsoft Data and Privacy for Speech-to-Text.
- The media sent to Azure Speech-to-Text services is processed only in Azure’s server memory and no data is stored at rest by the third party.
- Once transcribed, all transcripts are encrypted and safely stored within Genesys Cloud.
- All media sent to a third party is encrypted using TLS.
- Transcripts created by Extended Voice Transcription and recorded interactions are stored by Genesys Cloud using the same type of encryption.
For more information, see Recording encryption key overview, Understand voice transcripts, and Azure regions for Extended Voice Transcription Services.
Voice transcription – What is the difference between Genesys Cloud Voice Transcription and Extended Voice Transcription Services?
Both Genesys Cloud Voice Transcription and Extended Voice Transcription Services (EVTS) can transcribe voice interactions.
The differences between them are summarized in the following list.
- EVTS extends Genesys Cloud’s own native transcription.
- EVTS uses third-party transcription services and may have different performance attributes.
- EVTS can provide access to additional dialects and languages.
- EVTS uses a non-customizable transcription model. Customization is only available with Genesys Cloud Voice Transcription.
- Non-Genesys Cloud CX3 customers are also billed for the WEM Add-on (in addition to EVTS charges) when topic spotting is used.
For more information about EVTS, see:
- Voice transcription – How much does Extended Voice Transcription Services cost?
- Voice transcription – How does Extended Voice Transcription Services provide customer data security?
- Voice transcription – Is voice transcription supported using third parties such as Amazon, Google, or Microsoft?
Voice transcription – Can I download a voice transcript?
You can export transcripts from one or more interactions using the speech and text analytics API.
Also, a transcript can be copied manually from the Interaction Details page by clicking the Copy Transcript option in the top right corner of the transcript. For more information, see Work with a digital transcript.
For more information, see Speech and text analytics API.
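For those scripting the export, the following Python sketch shows the general shape of such a call. The endpoint path, response field name, and base URL here are assumptions for illustration; confirm them against the Speech and text analytics API reference:

```python
import json
import urllib.request

# Region-specific API base URL (assumption; use your org's region).
API_BASE = "https://api.mypurecloud.com"

def transcript_url_endpoint(conversation_id: str, communication_id: str) -> str:
    """Build the endpoint that returns a transcript download URL.
    The path shown here is an assumption; verify it in the API reference."""
    return (f"{API_BASE}/api/v2/speechandtextanalytics/conversations/"
            f"{conversation_id}/communications/{communication_id}/transcripturl")

def fetch_transcript_url(conversation_id: str, communication_id: str,
                         access_token: str) -> str:
    """Call the endpoint with an OAuth bearer token and return the URL."""
    req = urllib.request.Request(
        transcript_url_endpoint(conversation_id, communication_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        # The "url" field name is an assumption about the response shape.
        return json.load(resp)["url"]
```

The returned URL can then be fetched to download the transcript document itself.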
Voice transcription – What is the accuracy of voice transcription and how do I increase it?
A variety of factors can affect transcription accuracy. For more information, see Improving transcription accuracy. Genesys Cloud native voice transcription performs at a similar level of accuracy to other transcription vendors.
After you address all factors that may negatively impact accuracy, you can use dictionary management to improve accuracy.
Dictionary management provides a means of improving recognition for business or domain-specific terms. Specific brands, words, or acronyms are transcribed based on the organization’s specifics. This feature allows customers to add terms to the dictionary, enhancing the transcription service’s likelihood of recognition. For more information, see Understand dictionary management.
Dictionary Management does not interfere with topic spotting. Topic spotting supports native voice transcription dialects. For more information, see Genesys Cloud supported languages.
Perform the following to improve accuracy with topic spotting.
- Add the term to the phrase list within a new or existing topic.
- Verify the specific topic is added to the topic list of the program used to transcribe the interactions.
Transcription accuracy rates can vary significantly within the contact center based on audio quality, clarity of speech, and additional training provided through topics.
Accuracy of voice transcription is typically measured by Word Error Rate (WER). WER identifies the number of words that are incorrectly transcribed during voice transcription, and divides this number by the number of words in a manual transcription.
There are three types of errors.
- Insertion (I): Words that are incorrectly added to the transcript.
- Deletion (D): Words that are missing from the transcript.
- Substitution (S): Words that are replaced with incorrect words.
These are added together and divided by the total number of words from the manual transcription (N).
WER is then calculated with the following equation: WER = (I + D + S) / N
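Applied in code, the WER formula is a one-liner once the error counts are known. Note that producing the counts I, D, and S requires aligning the transcript against the manual reference, which this sketch does not do:

```python
def word_error_rate(insertions: int, deletions: int,
                    substitutions: int, reference_word_count: int) -> float:
    """WER = (I + D + S) / N, where N is the word count of the
    manual (reference) transcription."""
    return (insertions + deletions + substitutions) / reference_word_count

# 2 insertions, 1 deletion, and 3 substitutions against a
# 100-word manual transcription give a WER of 6%:
print(word_error_rate(2, 1, 3, 100))  # 0.06
```

Because N is the reference word count rather than the transcript's, WER can exceed 1.0 when a transcript contains many insertions.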
For more information, see Improving transcription accuracy, and Work with a phrase.
Voice transcription – How do I make sure that custom words, product names, and brand names are transcribed correctly?
A variety of factors can affect transcription accuracy. For more information, see Improving transcription accuracy.
After you address all factors that may negatively impact accuracy, you can use dictionary management to improve accuracy.
Dictionary management provides a means of improving recognition for business or domain-specific terms. Specific brands, words, or acronyms are transcribed based on the organization’s specifics. This feature allows customers to add terms to the dictionary, enhancing the transcription service’s likelihood of recognition. For more information, see Understand dictionary management.
Boost values range from 1 to 10 and increase the likelihood of the term’s identification on a logarithmic scale. Boost values are only available with the API.
Dictionary Management does not interfere with topic spotting. Topic spotting supports native voice transcription dialects. For more information, see Genesys Cloud supported languages.
Perform the following to improve accuracy with topic spotting.
- Add the term to the phrase list within a new or existing topic.
- Verify the specific topic is added to the topic list of the program used to transcribe the interactions.
After applying new terms, example phrases, similar-sounding terms, and boost levels through the API, it may take up to 30 minutes for the improvements to become effective.
For more information, see Improving transcription accuracy, and Work with a phrase.
Voice transcription – Is voice transcription supported using third parties such as Amazon, Google, or Microsoft?
Genesys Cloud uses its own native transcription engine and includes Extended Voice Transcription Services (EVTS) as an alternative to voice transcription. The underlying provider for Extended Voice Transcription Services can be either Microsoft Azure Speech-to-Text, or AWS Transcribe.
EVTS provides customers with additional language support beyond the Genesys Cloud native transcription engine, and a choice between the engines when transcribing voice interactions.
For other voice transcription providers such as Google, you must integrate using existing AudioHook and Transcription connector capabilities.
For more information, see: About AudioHook Monitor, and Voice transcription – What is the difference between Genesys Cloud Voice Transcription and Extended Voice Transcription Services.
Voice transcription – What is the expected latency and level of accuracy for voice transcription?
Within Genesys Cloud, audio is transcribed in near real time, with a typical latency of 3-5 seconds, and is accessible through our Notifications APIs. The full interaction transcript becomes available in the Interaction Details UI immediately after the call, usually within 15 seconds.
For more information, see Genesys Cloud supported languages, How do I increase the accuracy of voice transcription?, Configure voice transcription, and How do I make sure that custom words, product names, and brand names are transcribed correctly?.