Setup of customer hosting

    Overview

    Analytics Studio, by LivePerson, is a groundbreaking solution that enables data-driven enterprises to leverage the hidden potential in their omni-channel call recordings and text conversations. This analytics platform seamlessly captures and collates asynchronous messaging, synchronous chats, and speech through LivePerson's Conversational Cloud and VoiceBase platform. With Analytics Studio, businesses can quickly convert conversations into actionable data, providing valuable insights in a matter of minutes.
    Through the Customer Hosted Analytics Studio offering, LivePerson gives you the power to host your own data and to visualise it in the BI platform of your choice. You may also map your LivePerson data to your own custom fields for a more personalised analytics experience. This page walks through how to set up the Customer Hosted Analytics Studio feature.

    Image

    Customer hosted analytics studio overview diagram

    All the data shown in the screenshots or mentioned elsewhere in this tutorial comes from a test/playground dataset called ThriveAir, which resembles real-world interactions between an airline's customer service representatives and their customers.

    Setup

    The setup consists of two parts: Amazon S3 and the LivePerson Analytics Studio platform.
    Currently, this page focuses on AWS S3, with Azure Blob Storage being added soon (Q4 2023).

    AWS S3 Setup

    1. Access & Admin controls
    2. AWS Stack Creation
    3. AWS S3 Bucket Folder Creation
    4. Fetching details from AWS for Analytics Studio
    5. Setup AWS IAM Roles & Policies

    Access & Admin controls

    Make sure you have the right credentials and IAM roles to create and set up S3 buckets and to assign roles and permissions. If not, contact your IT Support staff.
    Once cleared, sign in to AWS as the root user.

    AWS Stack Creation

    This section deals with setting up your AWS account so that it's ready to ingress data from the LivePerson Analytics Studio platform.

    1. In your AWS account, head over to the CloudFormation console and click on the yellow "Create Stack" button.
    Image

    AWS: Create Stack

    2. Leave everything as is and paste the following S3 policy template URL into the "Amazon S3 URL" field, as shown in the image below: https://s3.amazonaws.com/public.voicebase.com/s3-ingress/voicebase-s3-ingress.json Then click "Next".

    Image

    S3 URL

    3. Now, on the stack details page, fill in the fields per your use case.


    Copy these details into a separate notepad, as we will need them later in the LivePerson Analytics Studio section of our setup.

    Make sure the "Company Name" field is all lowercase with no spaces (for example, thrive-air).

    AWS Field        | Corresponding LivePerson Analytics Studio Field | Values
    Stack Name       | N/A                                             | Thrive-Air-Data
    Company Name     | Analytics Studio A/C Name                       | thrive-air
    External ID      | External ID                                     | fd87c-672...
    VoiceBase Region | Analytics Studio A/C Region                     | US
    Image

    Stack Details
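
    Since a badly formed "Company Name" only shows up later as a confusing bucket name, it can help to check the value before pasting it into the stack form. The following is a minimal sketch, not part of the LivePerson tooling. The function names are hypothetical, and the pattern assumes that digits and hyphens are acceptable because the example value in this tutorial is thrive-air; the guide itself only states "lower case" and "no spaces".

```python
import re

# Assumption: besides lowercase letters, digits and single hyphens are allowed,
# because the tutorial's own example value is "thrive-air".
_COMPANY_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_company_name(name: str) -> bool:
    """Return True if the name is lowercase, has no spaces, and fits the assumed pattern."""
    return bool(_COMPANY_NAME.fullmatch(name))


def suggest_company_name(raw: str) -> str:
    """Lowercase the input and collapse every run of other characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", raw.strip().lower()).strip("-")


if __name__ == "__main__":
    for candidate in ("thrive-air", "Thrive Air"):
        print(candidate, "->", is_valid_company_name(candidate),
              "| suggestion:", suggest_company_name(candidate))
```

    Running the suggestion helper on a display name such as "Thrive Air" yields thrive-air, which is the form used throughout the rest of this page.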

    You may have noticed an interesting field here: External ID. This is the link between your AWS account and your LivePerson Analytics Studio account. To get your unique External ID, go to the LivePerson Analytics Studio Setup section further down on this page, copy the External ID from LivePerson Analytics Studio, and paste it into this field. With all these details filled in, click "Next" on the bottom right.

    On the following page, leave all fields as is and click "Next".

    Once you arrive on the "Review" page, you can go through all the details you provided. Scroll all the way to the bottom and select the checkbox "I acknowledge that AWS CloudFormation might create IAM resources with custom names." Then click "Submit" and voila! You've just created a stack!

    Image

    Capabilities

    It might take up to 3 minutes for AWS to create and update the fields in your stack. Be sure to pause for a couple of minutes before you head to the next step.


    AWS S3 Bucket Folder Creation

    If no errors have been raised so far and the process up to this point has completed successfully, you should have an S3 bucket automatically created for you.

    1. Navigate to your S3 buckets page. You can find all your S3 Buckets at the following link: AWS S3 Buckets
    2. Once here, open the bucket with the following name format: vb-[CompanyName]-s3-bucket. This is the same "Company Name" you entered in the AWS Stack Creation section earlier.
    3. Now, click on "Create Folder". Make sure the name follows this format: [CompanyName]-folder, then click on "Create Folder" to confirm. Now you have a place on S3 where all your LivePerson Analytics Studio data will stay.


    Image

    Folder Creation
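
    The two naming formats above are easy to mistype, so here is a small sketch that derives both names from the "Company Name" value. The function name is hypothetical; the formats themselves (vb-[CompanyName]-s3-bucket and [CompanyName]-folder) are taken directly from the steps above.

```python
def expected_s3_names(company_name: str) -> dict:
    """Build the bucket and folder names this guide expects for a company name."""
    # The guide requires a lowercase value without spaces; fail early otherwise.
    if not company_name or company_name != company_name.lower():
        raise ValueError("company name must be non-empty and lowercase")
    if any(ch.isspace() for ch in company_name):
        raise ValueError("company name must not contain spaces")
    return {
        "bucket": f"vb-{company_name}-s3-bucket",
        "folder": f"{company_name}-folder",
    }


if __name__ == "__main__":
    print(expected_s3_names("thrive-air"))
```

    For the ThriveAir example this gives vb-thrive-air-s3-bucket and thrive-air-folder, which are the values you will see again in the next section.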

    Fetching details from AWS for Analytics Studio

    In this step we fetch resource details from AWS and fill them into our LivePerson Analytics Studio setup page.

    1. Head over to the "Stack Details" page of the newly created stack. There, click on the "Outputs" tab. Now copy and paste the following values from AWS to Analytics Studio:
    Values                  | AWS Field  | Location                                                                    | Corresponding LivePerson Analytics Studio Field
    arn:aws...              | IAMRoleArn | CloudFormation -> Stack Details -> Outputs                                  | S3 ROLE
    vb-thrive-air-s3-bucket | S3Bucket   | CloudFormation -> Stack Details -> Outputs                                  | BUCKET
    thrive-air-folder       | Folder     | https://s3.console.aws.amazon.com/s3/buckets -> vb-[CompanyName]-s3-bucket | FOLDER


    Image

    AWS field details
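
    If you prefer to collect these values in a script rather than by copy and paste, the mapping in the table above can be sketched as follows. This is only an illustration: the function name is hypothetical, the input is assumed to be a list of OutputKey / OutputValue pairs (the shape in which CloudFormation stack outputs are commonly returned by the AWS SDKs), and the ARN in the usage example is a made-up placeholder.

```python
def data_output_fields(stack_outputs: list, folder: str) -> dict:
    """Map CloudFormation stack outputs to the Analytics Studio Data Output fields."""
    by_key = {item["OutputKey"]: item["OutputValue"] for item in stack_outputs}
    # IAMRoleArn and S3Bucket are the output names listed in the table above.
    missing = [key for key in ("IAMRoleArn", "S3Bucket") if key not in by_key]
    if missing:
        raise KeyError(f"stack outputs are missing: {', '.join(missing)}")
    return {
        "S3 ROLE": by_key["IAMRoleArn"],
        "BUCKET": by_key["S3Bucket"],
        "FOLDER": folder,  # the folder you created by hand in the bucket
    }


if __name__ == "__main__":
    outputs = [
        # Placeholder ARN for illustration only.
        {"OutputKey": "IAMRoleArn", "OutputValue": "arn:aws:iam::123456789012:role/example-role"},
        {"OutputKey": "S3Bucket", "OutputValue": "vb-thrive-air-s3-bucket"},
    ]
    for field, value in data_output_fields(outputs, "thrive-air-folder").items():
        print(f"{field}: {value}")
```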

    Setup AWS IAM Roles & Policies

    In this section we set up the required roles, permissions and policies to allow data transfer between AWS and Analytics Studio.

    1. In your AWS console, navigate to the Roles section on the IAM page. You can do this by clicking on the top left Services -> All Services -> Security & Identity -> IAM -> Roles or by following this link: AWS IAM Dashboard / Roles
    2. Locate the role associated with the stack you created in section 2 of this guide (AWS Stack Creation).
    Image

    IAM Roles


    3. Once on the roles page, click on the small plus sign to the left of the policy name to open a drop-down. Then click on "Edit" to update the policy details.

    Image

    Update Policy

    4. Select all the text inside the policy and delete it. Now, paste in the policy text below (also shown in the image). Then click "Next", followed by "Save Changes".

    javascript:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:GetObjectVersion",
            "s3:DeleteObject",
            "s3:DeleteObjectVersion",
            "s3:ListBucket"
          ],
          "Resource": "*"
        }
      ]
    }
    Image

    Modify Permissions
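
    The policy above allows the listed actions on every resource ("Resource": "*"). If your security team asks for something narrower, the sketch below generates a variant limited to a single bucket. Treat it as an untested assumption rather than a supported configuration: this guide does not state whether Analytics Studio works with a narrower Resource, so confirm with LivePerson Support before replacing the documented policy. The function name is hypothetical.

```python
import json

# Object-level actions copied from the policy in this guide.
OBJECT_ACTIONS = [
    "s3:PutObject",
    "s3:GetObject",
    "s3:GetObjectVersion",
    "s3:DeleteObject",
    "s3:DeleteObjectVersion",
]


def scoped_policy(bucket: str) -> dict:
    """Return a policy limited to one bucket (assumption: no other resource is needed)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            # s3:ListBucket is evaluated against the bucket itself...
            {"Effect": "Allow", "Action": ["s3:ListBucket"],
             "Resource": f"arn:aws:s3:::{bucket}"},
            # ...while object actions are evaluated against the keys inside it.
            {"Effect": "Allow", "Action": OBJECT_ACTIONS,
             "Resource": f"arn:aws:s3:::{bucket}/*"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(scoped_policy("vb-thrive-air-s3-bucket"), indent=2))
```

    The split into two statements is there because bucket-level and object-level actions match different ARN forms in S3.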

    LivePerson Analytics Studio Setup

    Overview

    1. Access & Admin controls
    2. Analytics Studio Data Output setup
    3. Analytics Studio Data Output Setup with AWS details
    4. LivePerson Live Connection setup
    5. Trigger LivePerson Analytics Studio Job Queue
    6. Check for files in S3

    Access & Admin controls

    Once signed in to your LivePerson Analytics Studio account, navigate to Settings -> Users on the left pane.
    Under Roles, make sure you have the Admin role. If not, please ask LivePerson Customer Success.

    Analytics Studio Data Output setup

    Now click on "My Account" and scroll down to the section that reads "Analytics Studio Data Output - S3".
    Make sure you have "Enterprise Accelerator" enabled. If not, please reach out to LivePerson Support or Customer Success.
    If it is indeed enabled, your account page should look something like this:

    Image

    Analytics Studio Data output

    In the top right corner, click on "Edit Account" followed by "Data Output".

    Image

    Data Output


    You might notice a READ ONLY field named External ID. Make sure you copy its text into a notepad, as it will come in handy with your AWS setup.
    The External ID field is your LivePerson Analytics Studio Org ID, also known as your LivePerson Analytics Studio Account ID. It auto-populates during this setup.
    In fact, this is a good time to go back to your AWS account and enter this value on the stack details page.

    Analytics Studio Data Output Setup with AWS details

    Once here, enter the details from your AWS account as described in the "Fetching details from AWS for Analytics Studio" section of the AWS S3 Setup.
    Once done, it should look something like the image below.

    Here, FOLDER is a user defined field. You can name it whatever you like as long as the folder in your S3 bucket bears the same name.



    Image

    Data Output Details

    Trigger LivePerson Analytics Studio Sync Warehouse

    Having set up the IAM access roles and policies, we are now ready to trigger the job queue from Analytics Studio.

    1. In your LivePerson Analytics Studio account, navigate to Manage Data -> Job Queue -> Add Job Queue

    Image

    Job Queue

    2. Select the appropriate Created To / From dates from the drop-down and click "Save". This creates a request to sync data to your S3 bucket.

    Image

    Create Job

    This process takes about 10 to 15 minutes to complete. You may click on the refresh button in the top right corner to see the latest status updates.

    Note that a "Finished" status refers only to the ETL process having completed. The data is yet to be transferred at that point and may take a couple of minutes to appear in your S3 bucket.



    Check for files in S3

    In your AWS console, navigate to the folder you created for the data transfer, the same folder you provided to Analytics Studio. You should now see a new folder named after the current year, with sub-folders for months and days, followed by hours. For more details on the file structure and its purpose, please see the section "S3 Folder & File Structure" further down on this page.

    Image

    Check for files in S3
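
    To know where to look after a sync, you can compute the year/month/day/hour prefix for a given moment. The sketch below follows the folder layout described on this page; the function name is hypothetical, and the guide does not say which timezone the folder names use, so the function simply formats whatever datetime you pass in.

```python
from datetime import datetime


def hourly_prefix(folder: str, when: datetime) -> str:
    """Return the S3 key prefix <folder>/<YYYY>/<MM>/<DD>/<HH>/ for a point in time."""
    # Assumption: `when` is already expressed in the timezone used by the sync.
    return f"{folder.strip('/')}/{when.strftime('%Y/%m/%d/%H')}/"


if __name__ == "__main__":
    print(hourly_prefix("thrive-air-folder", datetime(2023, 7, 20, 14, 20, 4)))
```

    You can paste the resulting prefix into the S3 console search box, or use it as the prefix when listing objects with the AWS CLI or an SDK.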

    S3 Folder & File Structure

    Image

    S3 Folder


    As seen in the image above, our thrive-air-folder from earlier now has a new sub-folder created inside of it.

    Image

    S3 Bucket Folder and Sub folder

    It contains additional sub-folders in the following hierarchy:

    bash:
    All your S3 Buckets
    └── vb-mycompanyinc-s3-bucket                            [ The bucket you created for LivePerson Analytics Studio ]
        └── thrive-air-folder                                [ The folder you mentioned in LivePerson Analytics Studio ]
            └── 2023                                         [ Year of folder creation / Data sync ]
                └── 07                                       [ Month ]
                    └── 20                                   [ Day ]
                        └── 14                               [ Hour ]
                            └── 2023-07-20-14-20-04-49106ac2-...   [ Minute & Unique hash ]
                                ├── manifest.json
                                ├── parquet/                 [ Folder that contains the actual parquet data files ]
                                └── upload.complete          [ Used for upload notification ]

    Overall folder structure

    Let's dive deeper into the purpose of the critical folders and files here.

    The manifest.json tells you what files you actually have in your parquet folder. It is also used as a debugging mechanism to figure out where the source data came from.
    The upload.complete file is used for SQS notifications. You can hang a notification off of the upload.complete file so that the moment your AWS bucket receives new data, you get an SQS notification. The JSON file itself mentions the exact name and path of the data (parquet files).
    The parquet/ folder itself houses all the parquet files.

    If you put a notification on the parquet folder instead, you will receive a notification before the files have finished writing, which is not ideal. That is why upload.complete exists: it is written only after all the parquet data files have been fully written.
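
    As a rough illustration of hanging a notification off upload.complete, the sketch below builds an S3 event notification configuration filtered on that file name and shows how a consumer could turn an incoming message into the matching parquet/ location. Several things here are assumptions: the function names and the queue ARN are hypothetical, the configuration is only constructed (not applied to a bucket), and the message parsing assumes S3 delivers its events directly to the SQS queue without an SNS wrapper.

```python
import json
from urllib.parse import unquote_plus

MARKER = "upload.complete"


def notification_config(queue_arn: str, folder: str) -> dict:
    """Build an S3 notification configuration that fires only for upload.complete."""
    return {
        "QueueConfigurations": [{
            "QueueArn": queue_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [
                {"Name": "prefix", "Value": f"{folder}/"},
                {"Name": "suffix", "Value": MARKER},
            ]}},
        }]
    }


def completed_runs(sqs_body: str) -> list:
    """Return the parquet/ locations announced by one SQS message body."""
    event = json.loads(sqs_body)
    runs = []
    # Messages without "Records" (such as the test event S3 sends) are skipped.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        if key.endswith("/" + MARKER):
            run_prefix = key[: -len(MARKER)]
            runs.append(f"s3://{bucket}/{run_prefix}parquet/")
    return runs
```

    The dictionary returned by notification_config has the shape expected by the S3 bucket notification API, so it could be passed to an SDK call once the queue's access policy allows S3 to send messages to it.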


    Here's a look at the demo S3 URI

    s3://vb-mycompanyinc-s3-bucket/thrive-air-folder/2023/06/29/17/2023-06-29-17-17-59-5fb2a35e-8a9d-4c70-888e-d475d8a72001/

    The deep nesting is deliberate. It is standard AWS practice, and the reason for it is to avoid complexity and clutter: the idea is that we never end up with 100+ files in a single folder. If we did, it would be rather difficult to locate them and work with them programmatically unless you knew the exact key.
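
    Because the path is so regular, it is straightforward to take a run URI apart in code. The sketch below parses the demo URI into its bucket, folder, timestamp and unique ID. The names are hypothetical, and reading the run folder as YYYY-MM-DD-HH-MM-SS followed by an ID is an inference from the examples on this page rather than a documented format.

```python
from datetime import datetime
from typing import NamedTuple


class RunLocation(NamedTuple):
    bucket: str
    folder: str
    started: datetime
    run_id: str


def parse_run_uri(uri: str) -> RunLocation:
    """Split s3://<bucket>/<folder>/<YYYY>/<MM>/<DD>/<HH>/<run folder>/ into parts."""
    if not uri.startswith("s3://"):
        raise ValueError("expected an s3:// URI")
    bucket, _, key = uri[len("s3://"):].partition("/")
    parts = key.strip("/").split("/")
    if len(parts) < 6:
        raise ValueError("expected <folder>/<YYYY>/<MM>/<DD>/<HH>/<run folder>")
    run_folder = parts[-1]
    folder = "/".join(parts[:-5])  # everything before the year/month/day/hour levels
    # Assumption: the first 19 characters are the timestamp, the rest is the ID.
    started = datetime.strptime(run_folder[:19], "%Y-%m-%d-%H-%M-%S")
    return RunLocation(bucket, folder, started, run_folder[20:])


if __name__ == "__main__":
    print(parse_run_uri(
        "s3://vb-mycompanyinc-s3-bucket/thrive-air-folder/2023/06/29/17/"
        "2023-06-29-17-17-59-5fb2a35e-8a9d-4c70-888e-d475d8a72001/"
    ))
```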


    Image

    Manifest.json structure

    The parquet folder

    Each parquet file inside the parquet folder is a representation of an SQL table. You can think of it as each conversation being processed into a SQL-type relational schema with rows and columns.
    For a detailed explanation of each file, field and row in the parquet data, please head to this page.

    The annoyance with parquet is that you need a driver to open and view the data, since the data is stored in a columnar fashion. However, the pros far outweigh the cons. You can plug in a JDBC driver, connect via a BI platform or load the data via Python, and the data is already indexed. This makes it incredibly fast and performant.
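
    Before pointing a driver or BI tool at a downloaded file, a quick sanity check can save some confusion. Parquet files begin and end with the four bytes PAR1, so the sketch below checks for that marker using only the standard library. It does not read any data and is not a substitute for a real parquet reader; the function name is hypothetical, and the bytes written in the demo are fake content used only to exercise the check.

```python
import tempfile
from pathlib import Path

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path) -> bool:
    """Return True if the file starts and ends with the parquet magic bytes."""
    path = Path(path)
    # Smallest plausible file: magic + 4-byte footer length + magic.
    if path.stat().st_size < 12:
        return False
    with path.open("rb") as handle:
        head = handle.read(4)
        handle.seek(-4, 2)  # jump to four bytes before the end of the file
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        complete = Path(tmp) / "complete.parquet"
        complete.write_bytes(PARQUET_MAGIC + b"\x00" * 8 + PARQUET_MAGIC)  # fake bytes
        truncated = Path(tmp) / "truncated.parquet"
        truncated.write_bytes(PARQUET_MAGIC + b"\x00" * 16)  # footer marker missing
        print(looks_like_parquet(complete), looks_like_parquet(truncated))
```

    A file that fails this check was most likely cut off during download or is not a parquet file at all.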


    Customer Managed Policy and Inline Policy are two types of identity-based policies in AWS IAM. Here is a breakdown of each type and when to use them:
    Customer Managed Policy:

    • A Customer Managed Policy is a standalone policy that you create and administer within your own AWS account.
    • It can be attached to multiple users, groups, and roles within your account.
    • You can customize an existing AWS Managed Policy to fit the specific requirements of your organization.
    • Customer Managed Policies are recommended when the existing AWS Managed Policies do not meet the needs of your environment.
    • They provide more control and flexibility over the permissions defined in the policies.
    • When you make changes to a Customer Managed Policy, IAM creates a new version of the policy and stores up to five versions.
    • You can revert to an earlier version of the policy if needed.

    Inline Policy:

    • An Inline Policy is a policy that is created and embedded directly into a single IAM identity (user, group, or role).
    • It maintains a strict one-to-one relationship between the policy and the identity.
    • When you delete the identity, the inline policy is also deleted.
    • Inline Policies are useful when you want to ensure that the permissions in a policy are only assigned to the intended identity and not inadvertently assigned to any other identity.
    • They are typically used when you need a policy that is specific to a single user, group, or role.
    • If a policy could apply to multiple entities, it is better to use a Managed Policy instead.

    In most cases, AWS recommends using Managed Policies over Inline Policies. Managed Policies provide better manageability, versioning, and the ability to apply the same policy to multiple principal entities. Inline Policies are useful for maintaining strict control over the permissions assigned to a single identity.




    The parquet folder contains all your LivePerson Analytics Studio data.

    The manifest.json gives a further description of the data itself and is used for debugging purposes.

    The upload.complete file is used to issue automated notifications of a new data dump, i.e. you will get an SQS notification.

