Parquet data dictionary for customer hosting
Overview
In this section we dive into the actual setup for data delivery into your cloud storage system. We use Parquet as our intermediate data format to pass from Analytics studio to the BI platform of your choice. We choose Parquet as it is a columnar data format that is efficient, scalable and supports analytics. It stores data organized by column which makes it suitable for analytics systems. It supports compression and encoding techniques to reduce the file size. It supports schema evolution, making it suitable as an intermediate data format that can be passed into different analytics platforms.
Table & Schema Design
It is often the case that customers want to view certain metrics or a subset of fields and not the list of observations. Our parquet table schema allows for this 'cherry pick' selection of fields based on the unique use-case of our customers analytics needs.
In the sections that follow, we will go over the purpose of each table and how their fields link with each other as well as their real world meaning.
Tables
- Content
Call level summary - Content Metric Transposed
KPIs and other 40+ paralinguistic metrics about the conversation itself - Custom Content
Custom content and metrics derived from media metadata non-paralingusitic and factual such as agent name, phone number, etc - Ontology
Flattens the multi-level ONTOLOGY - this is used in tableau joins. - Segment
Turn-level information about what was spoken and how it was spoken - Category Observation
Information of when a category mapping was detected within a call/message interaction - Verb Noun Observation
Verb-Noun pairs extracted from an interaction. Great for quick summary analysis. - Metric Observation
Information about measured conversation metrics. - Sentiment Observation
Information about calculated sentiment scores on turn-level data (stereo). If no sentiment was calculated there should be no entry. If mono calls, this table will not be as useful. - Entity Observation
Information about extracted entities - Keyword Observation
Information about extracted keywords discovered with semantic indexing. This data is not as useful for contact center calls. - Topic Observation
Information about extracted topics discovered with semantic indexing. This data is not as useful for contact center calls. - Conversation Survey
Information about all the surveys like questions, answers that have taken place. - Deleted Content
Content Marked for deletion. - Engagement Agent Participants
Information about all the agent related to engagements - Engagement Campaign Info
Information about all the campaign of the engagements. - Engagement Customer Info
Information about all the customers that have engaged. - Engagement Info
Information about all the Conversations that have happened - Intent Observation
Information about something that get the system observed happened during a conversation - Skill Transfers
Information about skill based transfers of conversations - Classifier Observation
Information about classifier results. If no classifier was used there should be no entry. - Engagement Attachments
Information about files and other attachments uploaded or linked in the interaction. - Engagement Purchase
Information about purchases that took place as part of the interaction. - Engagement Purchase Item
Information about the purchase items themselves in conjunction to the purchase data. - Detector Observation
Information about detectors that fired and their scores. If no detector fired there should be no entry and hence no parquet file either. - Agent Surveys
Agent Survey as filled in by the agent either during or after the conversation has ended, and lets the agent record things like the nature or outcome of the conversation. - Agent Survey Questions
Information about the questions marked against a survey. - Agent Survey Answers
Information about the answer provided within a survey. - Agent Status Event
Information about change in agent's status ie. Online, offline, away and so on.
Content
This is the most foundational table based on which almost all other custom metrics and fields are generated. This table contains the call level summary about the media file itself. The media file being the voice or text conversation that took place between the agent/bot and user.
Table Name | Primary Key | Join Keys |
Content | Content ID | Content ID |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
media_id | VoiceBase mediaId - Unique id per call for VB API | CHAR(36) |
account_id | VoiceBase (platform) account ID | CHAR(36) |
account_name | Account name, whatever we use now, probably platform account name | VARCHAR(128) |
title | media title | VARCHAR(255) |
external_id | external media ID | VARCHAR(255) |
language | Overall language of the interaction as pre-selected by client/customer. | VARCHAR(5) |
start_datetime | Metadata from Customer - Call start date/time | DECIMAL(19,0) |
end_datetime | end time we have now but in UNIX time (seconds since 1970/01/01) | DECIMAL(19,0) |
duration_milli_sec | duration of audio file in milliseconds | DECIMAL(19,0) |
redacted | A boolean field, it signals if any information at all was redacted from the interaction. EG: False if no informatio was deleted from the voice/message interaction. This comes in handy when you would simply and quickly want to know in summary how many calls had any senstive data mentioned. Instead of having to look through all instances within the audio/message you can simply use this flag. | BOOLEAN |
interaction_type | Speech if voice, text if messaging. | VARCHAR(32) |
num_audio_channels | Count the distinct number of non-blank/empty w values in the words array where m equals TURN. | DECIMAL(2,0) |
upload_version | version from call details | VARCHAR(64) |
media_content_type | content type of media file | VARCHAR(48) |
transcript_status | Final status of VoiceBase transcription processing; finished, failed. | VARCHAR(32) |
num_words | new field, number of words in the transcript (count all entries in the word array of type “w”) | INTEGER |
metadata_json | Full JSON results | VARCHAR(8192) |
created_datetime | UNIX timestamp when audio was received by VoiceBase | DECIMAL(19,0) |
finished_datetime | UNIX timestamp when the VoiceBase system finished processing the audio | DECIMAL(19,0) |
processed_datetime | UNIX timestamp when export files were created | DECIMAL(19,0) |
publisher_vendor | Source of data (MIA / EH API) which API was used to retrive this data. For voice calls it is always null which is the default. In the near future it will include VoiceCore which is an additional engine. | VARCHAR(96) |
source_system | Overall source of the data. I.E. either VoiceBase or LivePerson. Not particularly useful to customer analytics. | VARCHAR(96) |
site_id | Unique numeric ID associated with LivePerson/Analytics Studio account. | VARCHAR(32) |
Content Metric Transposed
Conversation metrics, KPIs and other 40+ bits of insights related to the interaction from the content observation table. You may think of this table as holding insights and key parameters about the call/messaging interaction itself. This includes (but not limited to) overall sentiment of the call, duration and instances of silence, amplitude and volume of sound within the call and so on.
Table Name | Primary Key | Join Keys |
Content Metric Transposed | Content ID | Content ID |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
media_id | VoiceBase mediaId - Unique id per call for VB API | CHAR(36) |
account_id | VoiceBase (platform) account ID | CHAR(36) |
account_name | Account name, whatever we use now, probably platform account name | VARCHAR(128) |
title | media title | VARCHAR(255) |
external_id | external media ID | VARCHAR(255) |
language | Overall language of the interaction as pre-selected by client/customer. | VARCHAR(5) |
start_datetime | Metadata from Customer - Call start date/time | DECIMAL(19,0) |
end_datetime | end time we have now but in UNIX time (seconds since 1970/01/01) | DECIMAL(19,0) |
duration_milli_sec | duration of audio file in milliseconds | DECIMAL(19,0) |
redacted | A boolean field, it signals if any information at all was redacted from the interaction. EG: False if no informatio was deleted from the voice/message interaction. This comes in handy when you would simply and quickly want to know in summary how many calls had any senstive data mentioned. Instead of having to look through all instances within the audio/message you can simply use this flag. | BOOLEAN |
interaction_type | Speech if voice, text if messaging. | VARCHAR(32) |
num_audio_channels | Count the distinct number of non-blank/empty w values in the words array where m equals TURN. | DECIMAL(2,0) |
upload_version | version from call details | VARCHAR(64) |
media_content_type | content type of media file | VARCHAR(48) |
transcript_status | Final status of VoiceBase transcription processing; finished, failed. | VARCHAR(32) |
num_words | new field, number of words in the transcript (count all entries in the word array of type “w”) | INTEGER |
metadata_json | Full JSON results | VARCHAR(8192) |
created_datetime | UNIX timestamp when audio was received by VoiceBase | DECIMAL(19,0) |
finished_datetime | UNIX timestamp when the VoiceBase system finished processing the audio | DECIMAL(19,0) |
processed_datetime | UNIX timestamp when export files were created | DECIMAL(19,0) |
publisher_vendor | Source of data (MIA / EH API) which API was used to retrive this data. For voice calls it is always null which is the default. In the near future it will include VoiceCore which is an additional engine. | VARCHAR(96) |
source_system | Overall source of the data. I.E. either VoiceBase or LivePerson. Not particularly useful to customer analytics. | VARCHAR(96) |
site_id | Unique numeric ID associated with LivePerson/Analytics Studio account. | VARCHAR(32) |
Custom Content
This table contains custom data derived from media metadata that is non-paralinguistic (unlike Content Metric Transposed) in nature. It contains factual meta-data about the call such as the name of the agent on call, their phone number, timezone, email, the name and phone number of their superiors and so on.
Table Name | Primary Key | Join Keys |
Custom Content | Content ID |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for Analytics Studio | DECIMAL(19,0) |
time_zone | This is a custom field open to any data input from the customer side. | VARCHAR(48) |
source_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_ext_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_full_name | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_phone_number | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_email | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_level | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_location | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_score1 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_score2 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_team_ext_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_team_label | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_tenure | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_tz | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_manager_ext_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_manager_full_name | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_manager_phone_number | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_manager_email | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_supervisor_ext_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_supervisor_full_name | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_supervisor_phone_number | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_supervisor_email | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_coach_ext_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_coach_name | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_org_ext_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
agent_org_name | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_ext_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_full_name | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_phone_number | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_email | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_affiliation_dt | This is a custom field open to any data input from the customer side. | BIGINT |
person_area_code | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_city | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_country | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_postal_code | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_state | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
person_tz | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
affiliated_person_name1 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
affiliated_person_name2 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
affiliated_person_role1 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
affiliated_person_role2 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
after_contact_work_duration | This is a custom field open to any data input from the customer side. | INTEGER |
after_contact_work_end_dt | This is a custom field open to any data input from the customer side. | BIGINT |
after_contact_work_start_dt | This is a custom field open to any data input from the customer side. | BIGINT |
call_direction | This is a custom field open to any data input from the customer side particular to their unique use case. In this case: inbound or outbound (agnet called or customer call) sales call / query call | VARCHAR(32) |
call_type | This is a custom field open to any data input from the customer side particular to their unique use case. In this case: support call / sales call / etc | VARCHAR(48) |
fcr | This is a custom field open to any data input from the customer side particular to their unique use case. In this case: whether or not the customer's issue was resovled in the first call itself. | BOOLEAN |
caller_location | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
campaign_name | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
campaign_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
codec_change | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
dialed_extension | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
dtmf_events | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
hold_events | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
hold_time | This is a custom field open to any data input from the customer side. | INTEGER |
outcome_code1 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
outcome_code2 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
outcome_code3 | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
incident_id | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
line_type | This is a custom field open to any data input from the customer side. | VARCHAR(64) |
mac_address | This is a custom field open to any data input from the customer side. In this case it is an instance of a mac address. | VARCHAR(64) |
num_evals | This is a custom field open to any data input from the customer side. | INTEGER |
num_holds | This is a custom field open to any data input from the customer side. In this case it is an instance of number of holds while on the call | DECIMAL(3,0) |
queue_duration | This is a custom field open to any data input from the customer side. In this case it is an instance of total wait time / queue duration | INTEGER |
rx_codec | This is a custom field open to any data input from the customer side. In this case it is an instance of a codec they would like to have mentioned. | VARCHAR(64) |
reason_code1 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
reason_code2 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
reason_code3 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code1 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code2 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code3 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code4 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code5 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code6 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code7 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
routing_code8 | This is a custom field open to any data input from the customer side. In this case it is an instance of a routing code (as seen in dtmf systems) they would like to have mentioned. | VARCHAR(64) |
survey_code1 | This is a custom field open to any data input from the customer side. In this case it is an instance of a survey or campaign code they would like to mention | VARCHAR(64) |
survey_code2 | This is a custom field open to any data input from the customer side. In this case it is an instance of a survey or campaign code they would like to mention | VARCHAR(64) |
survey_code3 | This is a custom field open to any data input from the customer side. In this case it is an instance of a survey or campaign code they would like to mention | VARCHAR(64) |
date_time1 | This is a custom field open to any data input from the customer side. In this case it is one instance of a datetime figure a customer would like to mention. | BIGINT |
date_time2 | This is a custom field open to any data input from the customer side. In this case it is one instance of a datetime figure a customer would like to mention. | BIGINT |
date_time3 | This is a custom field open to any data input from the customer. In this case it is one instance of a numeric figure a customer would like to mention. | BIGINT |
numeric_4_0 | This is a custom field open to any data input from the customer. In this case it is an instance of any numeric figure the customer would like to mention. | NUMERIC(4,0) |
numeric_10_0 | This is a custom field open to any data input from the customer. In this case it is an instance of any numeric figure the customer would like to mention. | NUMERIC(10,0) |
numeric_18_0 | This is a custom field open to any data input from the customer. In this case it is an instance of any numeric figure the customer would like to mention. | NUMERIC(18,0) |
numeric_4_3 | This is a custom field open to any data input from the customer. In this case it is an instance of any numeric figure the customer would like to mention. | NUMERIC(4,3) |
numeric_10_2 | This is a custom field open to any data input from the customer. In this case it is an instance of any numeric figure the customer would like to mention. | NUMERIC(10,2) |
numeric_18_8 | This is a custom field open to any data input from the customer. In this case it is an instance of any numeric figure the customer would like to mention. | NUMERIC(18,8) |
numeric_double | This is a custom field open to any data input from the customer. In this case it is an instance of any numeric figure the customer would like to mention. | DOUBLE |
Segment
Detailed summary insights of various sections within a messaging or voice interaction. For example, if certain words were mentioned within a call, this table mentions the particular section (start time, end time / message count) where said word was mentioned.
TABLE_NAME | PRIMARY KEY | JOIN KEYS |
Segment | segment_id | segment_id, content_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
segment_id | Should use a hash (MurrmurHash), possibly of mediaId+seq_num | DECIMAL(19,0) |
content_id | Unique id of media file | DECIMAL(19,0) |
segment_type | Turn level. Here a turn can be defined as all text sent by an agent until one is sent by the customer or vice versa. | VARCHAR(24) |
seq_num | Sequencing number of segment order through call | SMALLINT |
start_ms | Start milliseconds of segment | DECIMAL(19,0) |
end_ms | End milliseconds of segment | DECIMAL(19,0) |
duration_ms | Duration in milliseconds of segment | DECIMAL(19,0) |
start_ratio | START_MS/Total length of the call | DECIMAL(5,3) |
avg_word_confidence | Transcription confidence score within segment | DECIMAL(4, 3) |
volume_avg | Average volume of segment | DECIMAL(10, 3) |
volume_max | max volume within a segment | DECIMAL(10, 3) |
volume_stdev | standard deviation of volume within a segment | DECIMAL(10, 3) |
num_words | number of words within a segment | SMALLINT |
num_chars | number of characters within a segment | INTEGER |
conversation_sentiment | Sentiment value for this segment | DECIMAL(4, 3) |
speaker_name | Name of speaker if provided | VARCHAR(48) |
conversation_text | verbatim text of segment | VARCHAR(2000000) |
Ontology
This table flattens the multi-level hierarchical grouping of categories into a single string for easier manipulation and plotting in BI platforms or general analytics. Think of it as a category of categories to help group data in a more summarised manner.
TABLE_NAME | PRIMARY KEY | JOIN KEYS |
Ontology | ontology_id | ontology_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
ontology_id | Unique idenntifier for all ontologies | BIGINT |
ontology_label | Name of ontology (e.g. category name) | VARCHAR(128) |
observation_type | Type of observation (e.g. 'CATEGORY', 'DETECTOR',...) | VARCHAR(48) |
collection_name | Highest level of categories. Collection_Group_Label ex: Call Driver, Agent Quality, Alerts | VARCHAR(128) |
group_name | Sub groups of category organization ex: CallDriver_AccessIssue_CantLogin, Group=Access Issue | VARCHAR(64) |
deleted | indicates whether an ontology was deleted from the VoiceBase Platform system. Boolean. | BOOLEAN |
VerbNoun Observation
This table contains Verb-Noun pairs extracted from a conversation. These are great for quick, intuitive summary analysis of the purpose or outcome of a message or voice interaction.
Eg: Verb:Pay, Noun:Bill -- This intuitively tells us that the interaction has something to do with the user wanting to money and or paying bills.
Table Name | Primary Key | Join Keys |
VerbNoun Observation | Obvesrvation ID | Content_Id, Segment_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Generated reproducible hash code, MurmurHash(mediaId+verb+”/” noun) | DECIMAL(19,0) |
observation_type | “VERBNOUN” | VARCHAR(48) |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
segment_id | Reference to segment if there is one | DECIMAL(19,0) |
verb | This is the verb extracted from the transcript. ex "pay" | VARCHAR(64 |
noun | This is the noun extracted from the transcript, ex: "bill" | VARCHAR(64) |
question | Indicates if verb/noun pair was used in a question versus a statement | BOOLEAN |
speaker_name | Name of speaker if provided | VARCHAR(48) |
start_ms | Milliseconds since beginning of the call this observation was detected | DECIMAL(19,0) |
end_ms | Milliseconds since beginning of the call this observation ended | DECIMAL(19,0) |
start_ratio | START_MS/Total length of the call | DECIMAL(5,3) |
verb_negative | A boolean value to describe the context in which the verb is used. For example, TRUE for : I dont want to close my account. | VARCHAR(64) |
noun_negative | A boolean value to describe the context in which the Noun is used. For example, TRUE for : No worries. | VARCHAR(64) |
Category Observation
Information for when a calculated category was observed within an interaction, both matching and non-matching. Categories that do not match a turn (either one person talking in a dialogue) have no segment of the conversation allocated to themselves. It is essentially mapping categories to various segments across the entire conversation.
This table joins with Content, Segment and Ontology tables.
Table Name | Primary Key | Join Keys |
Category Observation | Obvesrvation ID | Content_Id, Segment_id, Ontology_Id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
OBSERVATION_ID | Generated reproducible hash code,: MurmurHash(mediaId+“Category”+ categoryName) | DECIMAL(19,0) |
OBSERVATION_TYPE | “CATEGORY” | VARCHAR(48) |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
ONTOLOGY_ID | PK of ONTOLOGY | BIGINT |
SEGMENT_ID | Reference to turn segment if there is one else null | DECIMAL(19,0) |
CATEGORY_MATCH | True if category value is 1, false otherwise (1 = Match, 0 = Not a Match, Null = category was not applied) | BOOLEAN |
SPEAKER_NAME | Name of speaker if provided | VARCHAR(48) |
start_ms | Milliseconds since beginning of the call this observation was detected | DECIMAL(19,0) |
end_ms | Milliseconds since beginning of the call this observation ended | DECIMAL(19,0) |
start_ratio | A mesure of the amount of time elapsed from the start of the conversation that the observation was detected | DECIMAL(5,3) |
START_RATIO | START_MS/Total length of the call | DECIMAL(5,3) |
Entity Observation
A mapping of items-- Names, places, people, things, etc extracted from the voice or messaging interaction by our classifier engine.
TABLE_NAME | PRIMARY KEY | JOIN KEYS |
Entity Observation | Observation_id | Content_Id, Ontology_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Uniqie ID per voice or messaging conversation for VoiceBase engine for the observation | DECIMAL(19,0) |
observation_type | A marker for the type of observation. In this case, "Entity" | VARCHAR(48) |
content_id | Uniqie ID per voice or messaging conversation for VoiceBase engine for the content or media | DECIMAL(19,0) |
ontology_id | Primary Key and reference to the Ontology table | BIGINT |
segment_id | Reference to segment if there is one | DECIMAL(19,0) |
entity_type | Type of entity, e.g. "person", "organization", ... | VARCHAR(24) |
speaker_name | Name of speaker if provided | VARCHAR(48) |
start_ms | Milliseconds since beginning of the call this observation was detected | DECIMAL(19,0) |
end_ms | Milliseconds since beginning of the call this observation ended | DECIMAL(19,0) |
start_ratio | START_MS/Total length of the call | DECIMAL(5,3) |
text_value | Actual text of the entity | VARCHAR(128) |
formatted_text_value | Formatted version of the entity | VARCHAR(255) |
person_first_name | If type "person", the detected first name | VARCHAR(64) |
person_middle_name | If type "person", the detected middle name | VARCHAR(64) |
person_last_name | If type "person", the detected last name | VARCHAR(64) |
money_currency | If type "currency", the detected currency | VARCHAR(64) |
money_major_unit | If type "currency", the detected major unit, e.g. for $12.49 this would be 12 | DECIMAL(19,0) |
money_minor_unit | If type "currency", the detected minor unit, e.g. for $12.49 this would be 49 | DECIMAL(5,0) |
Keyword Observation
A listing of keywords extracted by our classier models. We recommend that you use Verb-Noun observations for a more nuanced and insight-rich analytics experience. For general keyword discovery intents, this table works fine.
TABLE_NAME | PRIMARY KEY | JOIN KEYS |
Keyword Obersavation | Observation_id | Content_Id, Segment_id, Topic_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Uniqie ID per voice or messaging conversation for VoiceBase engine for the observation | DECIMAL(19,0) |
observation_type | A marker for the type of observation. In this case, "Keyword" | VARCHAR(48) |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
segment_id | Reference to segment if there is one | DECIMAL(19,0) |
name | Actual Extracted keyword | VARCHAR(128) |
score | Only mandatory if we always calculate score, need to clarify Valid value range: [0,1] | DECIMAL(10,3) |
topic_id | OBSERVATION_ID of topic this (topic) keyword contributed to | DECIMAL(19,0) |
speaker_name | Name of the speaker if provided. Either Agent or Consumer. | VARCHAR(48) |
start_ms | Milliseconds since beginning of the call this observation was detected | DECIMAL(19,0) |
end_ms | Milliseconds since beginning of the call this observation ended | DECIMAL(19,0) |
start_ratio | At what proportion of the call was this keyword first observed. Caclulated by : START_MS/Total length of the call | DECIMAL(5,3) |
Metric Observation
A listing of various conversation metrics and their values captured against the voice or messaging interaction.
TABLE_NAME | PRIMARY KEY | JOIN KEYS |
Metric Observation | Content_id | Content_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Generated reproducible hash code,: MurmurHash(mediaId+“DETECTOR”+ segmentSequence) | DECIMAL(19,0) |
observation_type | “METRIC” | VARCHAR(48) |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
ontology_id | PK of ONTOLOGY | BIGINT |
metric_value | Actual numeric value of the specified metric | DECIMAL(21, 3) |
Sentiment Observation
A segment-wise breakup of the voice or messaging interaction and calculated sentiment score for those turn-level sections. This is expressed as any decimal value between -1 and 1. Where '-1' is bad and '+1' is good.
TABLE_NAME | PRIMARY KEY | JOIN KEYS |
Sentiment Observation | Content_Id, Segment_id | Content_Id, Segment_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Generated reproducible hash code,: MurmurHash(mediaId+“SENTIMENT”+ segmentSequence) | DECIMAL(19,0) |
observation_type | “SENTIMENT” | VARCHAR(48) |
content_id | Uniqie ID per voice or messaging conversation for VoiceBase engine for the content or media | DECIMAL(19,0) |
segment_id | Reference to segment if there is one | DECIMAL(19,0) |
sentiment_value | Expressed as any decimal value between -1 and 1. Where '-1' is bad and '+1' is good. | DECIMAL(4, 3) |
speaker_name | Name of speaker if provided | VARCHAR(48) |
start_ms | Milliseconds since beginning of the call this observation was detected | DECIMAL(19,0) |
end_ms | Milliseconds since beginning of the call this observation ended | DECIMAL(19,0) |
start_ratio | At what proportion of the call was this feature first observed. Caclulated by : START_MS/Total length of the call | DECIMAL(5,3) |
Topic Observation
A table that details the general topic bucket that a voice or messaging conversation may fall into. These are built based on the Keywords extracted from the interaction. As is the case with the keywords table, we recommend using the VerbNoun Observation table for a richer analytics experience. However, topics are still great for a general analytics overview of the conversation.
TABLE_NAME | PRIMARY KEY | JOIN KEYS |
Topic Observation | observation_id | Content_Id, Parent_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Uniqie ID per voice or messaging conversation for VoiceBase engine for the observation | DECIMAL(19,0) |
observation_type | A marker for the type of observation. In this case, "Topic" | VARCHAR(48) |
content_id | Uniqie ID per call for Analytics Studio platfrom based on the media and found in the Content table | DECIMAL(19,0) |
name | Name of the topic. These are often single words or a long text string formed by concatanating a group of keywords. | VARCHAR(128) |
parent_id | OBSERVATION_ID of parent topic, if entry is a sub-topic. Mandatory for sub-topics as this is the only indicator if a topic is indeed a sub-topic. | DECIMAL(19,0) |
Conversation Survey
Information about all surveys that have taken place.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Conversation Survey | Content_Id | Survey_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
survey_id | Uses a hash (MurrmurHash), possibly of mediaId+seq_num | BIGINT |
survey_type | Indicates the survey types e.g. "Satisfaction", "PostSurvey" | VARCHAR(40) |
survey_status | Indicates the survey status e.g. "completed", "skipped", "closed by consumer" | VARCHAR(40) |
question | Indicates the questions asked in the during survey e.g. "How would you rate your overall satisfaction with the service you received?" | VARCHAR(255) |
answer | Indicates the answer given in survey e.g. "skip","5" | VARCHAR(1000) |
question_id | Indicates the question id of the questions asked in survey | VARCHAR(40) |
question_type | Indicates the question type e.g. "fcr" | VARCHAR(40) |
valid_answer | Indicates is answer is valid or not | BOOLEAN |
Deleted Content
Information about media that was deleted from the system
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Deleted Content | Content_id | Content_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
media_id | Unique ID of of the deleted media file | VARCHAR |
Engagement Agent Participants
Information about all the agent related to engagements
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Engagement Agent Participants | Content_Id,Agent_Id | Agent_participant_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
agent_participant_id | Generated reproducible hash code,: MurmurHash(mediaId +“/”+ agentId + "/" + agentRole) | BIGINT |
agent_id | Specify agent Id | VARCHAR(64) |
agent_full_name | Specify agent Full Name | VARCHAR(64) |
agent_group_name | Specify agent Group Name | VARCHAR(64) |
agent_group_id | Specify agent Group Id | BIGINT |
agent_name | Specify agent Nick Name | VARCHAR(64) |
agent_login_name | Specify agent Login Name | VARCHAR(64) |
agent_type_name | Specify Agent User Type Name e.g. :0 "Human" or "System", "Bot" | VARCHAR(10) |
agent_role | Specify agent role, e.g. "AGENT", "AGENTMANAGER" | VARCHAR(40) |
agent_responses | Count of responses from the agent in wihtin the interation | DECIMAL(4,0) |
agent_permission | Permissions granted to the participating agents | VARCHAR(24) |
seq_num | Order in which agents joined the conversation. This should be 1 for the first agent who joined and increment by 1. | DECIMAL(2,0) |
is_first_human | TRUE for the first human agent to join the conversation, FALSE for all other agents in the conversation | BOOLEAN |
is_last_human | TRUE for the last human agent to join the conversation, FALSE for all other agents in the conversation | BOOLEAN |
is_first_bot | TRUE for the first bot agent who join the conversation, FALSE for all other agents in the conversation | BOOLEAN |
is_last_bot | TRUE for the last bot agent who join the conversation, FALSE for all other agents in the conversation | BOOLEAN |
is_first | TRUE for the first human or bot agent to join the conversation, FALSE for all other agents in the conversation | BOOLEAN |
is_last | TRUE for the last human or bot agent to join the conversation, FALSE for all other agents in the conversation | BOOLEAN |
first_start_ms | Time elapsed since the start of the engagement in milliseconds. | DECIMAL(19,0) |
Engagement Campaign Info
Information about all the campaign of the engagements.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Engagement Campaign Info | Campaign_Id,Content_Id | Content_Id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Unique id of media file | DECIMAL(19,0) |
campaign_id | ID of the campaign from MIA response | VARCHAR(64) |
campaign_engagement_id | ID of the campaign’s engagement from MIA response | VARCHAR(64) |
campaign_engagement_name | Name of the campaignEngagement | VARCHAR(64) |
campaign_name | Name of the campaign. | VARCHAR(64) |
campaign_engagement_application_name | Name of Campaign Engagement’s application. | VARCHAR(64) |
line_of_business | Information about line of business. This is a custom field. | VARCHAR(64) |
Engagement Customer Info
Information about all the customers that have engaged.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Engagement Customer Info | Content_Id,Social_Id,Customer_Id | Customer_info_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Unique id of media file | DECIMAL(19,0) |
customer_info_id | Generated reproducible hash code,: MurmurHash(mediaId +“/”+ customerId) | BIGINT |
customer_id | CustomerId specified in MIA response | VARCHAR(128) |
company_size | Size of company specified in MIA response | INTEGER |
customer_type | Type of custmer e.g. :- vip | VARCHAR(128) |
store_number | Store # from which customer did purchasing | VARCHAR(40) |
account_name | Name of the customer account | VARCHAR(128) |
role | Role of customer | VARCHAR(40) |
social_id | Social id of customer | VARCHAR(40) |
customer_status | Status of the company e.g. cancelled | VARCHAR(256) |
Engagement Info
Information about all the Conversations that have happened
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Engagement Info | Content_Id, Visitor_id, Source_id,Conversation_Id | Content_Id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
conversation_id | ID of conversation | VARCHAR(40) |
close_reason | Reason for closing the conversation — by agent / consumer | VARCHAR(64) |
close_reason_description | Additional information regarding the conversation close reason | VARCHAR(255) |
csat_rate | CSAT Score assigned to the conversation. | DECIMAL(4,1) |
device | Type of device from which the conversation was initially opened. | VARCHAR(40) |
duration | Time from when the consumer started the conversation until it ended | BIGINT |
end_time | The dialog end time, readable format | VARCHAR(64) |
first_conversation | Whether it is the consumer’s first conversation. | BOOLEAN |
is_partial | Indicates whether the conversation’s data is partial | BOOLEAN |
latest_skill_id | An array of latest skill IDs, represented as numbers. The latest skill ID is the latest skill which the conversation was assigned under | DECIMAL(19,0) |
latest_skill_name | Most recent skill name that the conversation was assigned to | VARCHAR(255) |
mcs | Meaningful Conversation Score of the conversation. | INTEGER |
operating_system | Information about the operating system of the device | VARCHAR(32) |
start_time | The dialog start time, readable format. | VARCHAR(64) |
full_dialog_status | Information about the status of Full Dialog | VARCHAR(64) |
app_id | The name of the application | VARCHAR(128) |
language | Language of the conversation’s context | VARCHAR(32) |
interaction_context_id | Session ID in LivePerson system | VARCHAR(40) |
time_zone | Visitor’s time zone | VARCHAR(32) |
integration | The integration type: mobile-sdk, web-sdk, brand-sdk | VARCHAR(255) |
integration_version | The version of the integration | VARCHAR(16) |
operating_system_version | The operating system version, and distribution type (if relevant) | VARCHAR(16) |
browser_version | Detailed version info of the user agent (browser or host-application) | VARCHAR(128) |
visitor_id | Information about the Visitor ID | VARCHAR(40) |
app_name | Information about the Engagement’s application name. | VARCHAR(64) |
conversation_start_page | The page’s URL from which the conversation started | VARCHAR(2000) |
conversation_start_page_title | The page’s title from which the conversation started | VARCHAR(256) |
country | Name of country | VARCHAR(40) |
has_purchase | Flag if purchase has done | BOOLEAN |
source_id | Information about the source id | VARCHAR(40) |
reason_transfer_first | The reason property gives you insight into why the conversation was transferred. | VARCHAR(255) |
source_agent_full_name_transfer_first | Information about the Source Agent Full Name of First Transfer | VARCHAR(40) |
source_agent_id_transfer_first | Information about the Source Agent ID of First Transfer | VARCHAR(40) |
source_agent_login_name_transfer_first | Information about the Source Agent Login Name of First Transfer | VARCHAR(40) |
source_agent_nick_name_transfer_first | Information about the Source Agent Nick Name of First Transfer | VARCHAR(40) |
source_skill_id_transfer_first | Information about the Source Skill ID of First Transfer | VARCHAR(40) |
source_skill_name_transfer_first | Information about the Source Skill Name of First Transfer | VARCHAR(40) |
target_skill_id_transfer_first | Information about the Target Skill ID of First Transfer | VARCHAR(40) |
target_skill_name_transfer_first | Information about the Target Skill Name of First Transfer | VARCHAR(40) |
transfer_time_first | Information about the First Transfer TIme | VARCHAR(40) |
reason_transfer_last | Information about the Reason of last Transfer | VARCHAR(255) |
source_agent_full_name_transfer_last | Information about the Source Agent Full Name of last Transfer | VARCHAR(40) |
source_agent_id_transfer_last | Information about the Source Agent ID of Last Transfer | VARCHAR(40) |
source_agent_login_name_transfer_last | Information about the Source Agent Login Name of lastTransfer | VARCHAR(40) |
source_agent_nick_name_transfer_last | Information about the Source Agent Nick Name of last Transfer | VARCHAR(40) |
source_skill_id_transfer_last | Information about the Source Skill ID of last Transfer | VARCHAR(40) |
source_skill_name_transfer_last | Information about the Source Skill Name of lastTransfer | VARCHAR(40) |
target_skill_id_transfer_last | Information about the Target Skill ID of Last Transfer | VARCHAR(40) |
target_skill_name_transfer_last | Information about the Target Skill Name of last Transfer | VARCHAR(40) |
transfer_time_last | Information about the last Transfer TIme | VARCHAR(40) |
total_responses | Max amount of conversations in the response | INTEGER |
latest_agent_name | The agent’s login name | VARCHAR(64) |
latest_agent_id | Most recent agent ID the conversation was assigned to. | VARCHAR(64) |
latest_agent_group | Group name of the agent most recently assigned to the conversation | VARCHAR(64) |
latest_consumer_participant_id | Contains information about the latest consumer(s) participating id in the conversation | VARCHAR(64) |
nps_text | NPS_Text assigned to the conversation | VARCHAR(64) |
nps | NPS assigned to the conversation | DECIMAL(2,0) |
fcr_text | FCR_Text assigned to the conversation. | VARCHAR(64) |
fcr | Values of FCR (First Call Resolution) assigned to the conversation. | DECIMAL(2,0) |
csat_text | CSAT_Text value of the conversation | VARCHAR(64) |
csat | CSAT score of the conversation | DECIMAL(2,0) |
transfer_count | Information about the transfer count | DECIMAL(4,0) |
first_intent_name | The id of the intent that was created based on the first intent-able message in the conversation | VARCHAR(255) |
first_intent_label | The label of the intent that was created based on first intent-able message in the conversation | VARCHAR(255) |
conversation_status | Status of the conversation. EGL | NUMERIC(1,0) |
conversation_end_time | Timestamp of when the conversation ended | TIMESTAMP |
has_human_agent | A boleen value to flag if a human agent was involved in the engagement | BOOLEAN |
has_bot | A boleen value to flag if a bot agent was involved in the engagement | BOOLEAN |
Intent Observation
Information about something that the system observed happened during a conversation.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Intent Observation | Content_Id | Observation_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Unique id of media file | DECIMAL(19,0) |
observation_id | Generated reproducible hash code,: MurmurHash(segmentSequence + mediaId +“INTENT”+ intentName) | BIGINT |
observation_type | "INTENT" | VARCHAR(6) |
ontology_Id | Primary Key of ONTOLOGY | BIGINT |
name | Contains the intent name, generally its some uuid | VARCHAR(255) |
label | Contains the intent of sentence e.g. "ask about discount or promotion" | VARCHAR(255) |
confidence_score | Intent confidence level value as calculated by the integrated platform. | DECIMAL(8,3) |
primary_intend | Determine whether its first intent in the list. Acertained by picking up the first possible intent the system picks up in a conversation | BOOLEAN |
seq_num | Determines the sequence of the intent in the list | DECIMAL(4,0) |
is_first_defined | Boolean to check if the first intent is indeed exists | BOOLEAN |
meta_intent | It is a group of intents that have a similar nature. For example there can be a "check bill date" intent and a "check bill amount" intent that do two separate things, but are grouped under a meta intent called "Billing". | VARCHAR(255) |
is_defined | Boolean to check if an intent is defined. The definition of defined is: the LABEL is not null and not the string "undefined" (case insensitive). | BOOLEAN |
is_last_defined | Boolean. True, if it is the last defined intent in the list of intents activated against a conversation. | BOOLEAN |
Skill Transfers
Information about skill based transfers of conversations
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Skill Transfers | content_id, source_skill_id | skill_transfer_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
SKILL_TRANSFER_ID | Unique ID for the transfer. | DECIMAL(19,0) |
content_id | Unique ID for the Conversation | DECIMAL(19,0) |
REASON | What type of transfer was this, for example a Skill transfer to another skill, or an Agent transfer to another agent (for example the conversation is on the correct skill but another agent was handling it, or the current agent is stepping away) | VARCHAR(64) |
SOURCE_SKILL_ID | Unique Id for the skill name that the conversation started on. | DECIMAL(19,0) |
SOURCE_SKILL_NAME | The skill that the conversation was on when the transfer was initiated. | VARCHAR(64) |
SOURCE_AGENT_ID | Unique ID of agent who was in control of the conversation when the transfer was initiated | VARCHAR(64) |
TARGET_SKILL_ID | Unique ID of the skill to which the conversation was transfered. | DECIMAL(19,0) |
TARGET_SKILL_NAME | If it is a skill to skill transfer then this will be the skill that the conversations was transferred to. | VARCHAR(64) |
TARGET_AGENT_ID | Unique ID of the agent to which the conversation was transfered. | VARCHAR(64) |
TRANSFER_TIME_MS | UTC Time stamp at which conversation was transfered in milliseconds Unix Epoc Time | DECIMAL(19,0) |
START_MS | Milliseconds since beginning of the interaction that this transfer was made | DECIMAL(19,0) |
SEQ_NUM | The number of transfers that took place in one conversation, in a sequence starting from 1. | DECIMAL(4,0) |
IS_FIRST | Boolean. True if it is the first transfer in a conversation. | BOOLEAN |
IS_LAST | Boolean. True if it is the last transfer in a conversation. | BOOLEAN |
TRANSFER_TIME | UTC Time stamp at which conversation was transfered in seconds Unix Epoc Time | DECIMAL(19,0) |
Classifier Observation
Information about classifier results. If no classifier was used there should be no entry.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Classifier Observation | Content_Id, Ontology_id | Observation_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Generated reproducible hash code. Hash(mediaId+“CLASSIFIER”+ segmentSequence) | DECIMAL(19,0) |
observation_type | Type of observation (e.g. 'CATEGORY', 'DETECTOR', “CLASSIFIER”...). In this case, it would be “CLASSIFIER” | VARCHAR(48) |
content_id | Uniqie ID per call | DECIMAL(19,0) |
ontology_id | PK of ONTOLOGY | BIGINT |
class_name | Custom Name given to classifer model | DECIMAL(10,0), |
class_label | This is the index ID of the calculated classification | VARCHAR(128) |
score | A confidence score against the class assinged. Valid value range [-99.999,+99.999] | DECIMAL(10,3) |
version | Classifier version number | VARCHAR(64), |
Engagement Attachments
Information about files and other attachments uploaded or linked in the interaction.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Engagement Attachments | content_id, participant_id | engagement_attachment_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
engagement_attachment_id | Unique id of media file. Also PK for this table | DECIMAL(19,0) |
content_id | Unique id of media file | DECIMAL(19,0) |
type | "Link" or "File" depending on what's uploaded. | VARCHAR(16) |
file_type | File type of the file uploaded | VARCHAR(96) |
caption | Any caption text added aloing with the file upload | VARCHAR(255) |
location | Link or path of the uploaded file | VARCHAR(1024) |
speaker_name | Name of speaker if provided | VARCHAR(64) |
speaker_role | Role of speaker if provided | VARCHAR(64) |
participant_id | Generated reproducible hash code,: MurmurHash(mediaId +“/”+ agentId + "/" + agentRole) | VARCHAR(64) |
Engagement Purchase
Information about purchases that took place as part of the interaction.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Engagement Purchase | content_id, order_id | purchase_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
purchase_id | Generated reproducible hash code, suggested: MurmurHash(mediaId +“/”+ orderId) | BIGINT |
order_id | Extracted from response, Contains order id for each order | VARCHAR(40) |
quantity | Value is always 1 | INTEGER |
total | contais total order value e.g. 78.0 | DECIMAL(10, 2) |
currency | If type "currency", the detected currency value. | VARCHAR(16) |
cart_total | Total value of the customers cart | DECIMAL(10, 2) |
cart_currency | Currency type attached to the customer's cart | VARCHAR(16) |
Engagement Purchase Item
Information about the purchase items themselves in conjunction to the purchase data.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Engagement Purchase Item | content_id, purchase_id | purchase_item_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
purchase_item_id | Unique ID assigned to item purchased | DECIMAL(19,0) |
purchase_id | Unique ID for the purchase itself | BIGINT |
product_name | Name of product purchased if provided | VARCHAR(255) |
product_sku | SKU of product if provided | VARCHAR(40) |
product_category | Open text for category of product if provided | VARCHAR(255) |
product_price | Numeric amount for price of product if provded | DECIMAL(10,2) |
quantity | Quantity of product if provided. | DECIMAL(6,0) |
Detector Observation
Information about detectors that fired and their scores. If no detector fired there should be no entry and hence no parquet file either.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Detector Observation | Content_Id, Segment_id, Ontology_Id | observation_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
observation_id | Generated reproducible hash code, suggested: MurmurHash(mediaId+“DETECTOR”+ segmentSequence) | DECIMAL(19,0) |
observation_type | Type of observation (e.g. 'CATEGORY', 'DETECTOR', “CLASSIFIER”...). In this case, it would be “DETECTOR” | VARCHAR(48) |
content_id | Uniqie ID per call for EA product | DECIMAL(19,0) |
ontology_id | PK of ONTOLOGY | BIGINT |
segment_id | Reference to segment if there is one | DECIMAL(19,0) |
class_name | Index number of the classifier | DECIMAL(10,0) |
class_label | Lablel of the classifer if available. EG: "PCI" | VARCHAR(128) |
version | Version number of classifier used | VARCHAR(64) |
speaker_name | Name of speaker if provided | VARCHAR(48) |
start_ms | Milliseconds since beginning of the call this observation was detected | DECIMAL(19,0) |
end_ms | Milliseconds since beginning of the call this observation ended | DECIMAL(19,0) |
start_ratio | START_MS/Total length of the call | DECIMAL(5,3) |
Agent Surveys
Agent Survey as filled in by the agent either during or after the conversation has ended, and lets the agent record things like the nature or outcome of the conversation.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Agent Surveys | survey_id, ac_survey_id | agent_survey_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
agent_survey_id | Unique ID for the survey record | DECIMAL(19,0) |
content_id | Unique id for the record across platform | DECIMAL(19,0) |
survey_status | Operational status of the survey. EG: Open, Closed | VARCHAR(40) |
status_reason | Cause for the provided status. EG: If the survey_status is "CLOSED" then the status_reson might be "TIMEOUT". Ie. the survey ended due to inactivity or a set timeout. | VARCHAR(64) |
survey_id | Unique ID per survey | VARCHAR(40) |
ac_survey_id | A unique ID for the survey per LivePerson Platform | VARCHAR(40) |
ac_survey_name | A select name for the survey per LivePerson Platform | VARCHAR(40) |
ac_survey_revision | A unique ID for the survey per LivePerson Platform | DECIMAL(4,0) |
survey_skill_id | Skill ID assigned to the survey | DECIMAL(19,0) |
survey_skill_name | Skill Name assigned to the survey | VARCHAR(64) |
assigned_agent_id | ID of the agent assigned to the survey | VARCHAR(64) |
performed_agent_id | VARCHAR(64) | |
last_update_time | Timestamp of when the last update (Unix Epoc Seconds) | DECIMAL(19,0) |
is_latest_survey | Boolean to check if its the latest survey | BOOLEAN |
Agent Survey Questions
Information about the questions marked against a survey.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Agent Survey Questions | content_id, agent_survey_id | agent_survey_question_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
agent_survey_question_id | Unique ID per survey question | DECIMAL(19,0) |
content_id | Uniique ID for VB Plaftorm | DECIMAL(19,0) |
agent_survey_id | Unique ID of the survey to which the question is attached | DECIMAL(19,0) |
question_text | Survey question text | VARCHAR(255) |
question_id | Unique ID per question | VARCHAR(40) |
question_definition | Defiition/ Description of the question | VARCHAR(40) |
question_category | Open text category to which the survey belongs | VARCHAR(40) |
is_conversation_outcome | if TRUE, said question marks the outcome of the survey. | BOOLEAN |
Agent Survey Answers
Information about the answer provided within a survey.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Agent Survey Answers | content_id, agent_survey_id | agent_survey_answer_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
agent_survey_answer_id | Unique ID per survey answer | DECIMAL(19,0) |
content_id | Uniique ID for VB Plaftorm | DECIMAL(19,0) |
agent_survey_id | Unique ID of survey to which answer is attached | DECIMAL(19,0) |
answer | Survey answer text | VARCHAR(255) |
answer_id | Unique ID per answer | VARCHAR(40) |
Agent Status Event
Information about change in agent's status ie. Online, offline, away and such.
TABLE_NAME | JOIN KEYS | PRIMARY KEY |
Agent Status Event | agent_status_event_id, agent_id | agent_status_event_id |
FIELD NAME | DESCRIPTION | DATA TYPE & LIMIT |
agent_status_event_id | Unique ID per event | DECIMAL(19,0) |
account_id | Unique ID per VB account | VARCHAR(36) |
agent_id | Unique ID per agent in alpha numeric format | VARCHAR(64) |
agent_id_numeric | Unique ID per agent in strictly numeric format | DECIMAL(19,0) |
status_change_time | Timestamp of status update | TIMESTAMP |
session_id | Unique ID for a given conversation between customer and agent. ie. Identifier of the session during which this status change took place | DECIMAL(19,0) |
sequence_number | Sequence number that prescribes the order of each message / event | INTEGER |
status_type | Type of status change. 1 - status changed, see `statusSubType` 3 - login 4 - logout | SMALLINT |
status_sub_type | Subtype of status change with statusType=1 . 1 - offline 2 - online 3 - occupied 4 - away | SMALLINT |
status_reason_id | Identifier of optional custom reason for the status change | DECIMAL(19,0) |
status_reason_text | Optional custom reason for the status change | VARCHAR(64) |
prev_status_change_time | Time of this agent’s previous status change | TIMESTAMP |
Missing Something?
Check out our Developer Center for more in-depth documentation. Please share your documentation feedback with us using the feedback button. We'd be happy to hear from you.