Day 4: Run in production + API

Today we’ll run the app in production and add an API that lets us query by userId and anonymousId.

Create basic API

Add these functions to your serverless.yml:

getAnonymous:
  handler: src/handlers/api/getAnonymous.handler
  events:
    - http: get /api/anonymous/{id}
getUser:
  handler: src/handlers/api/getUser.handler
  events:
    - http: get /api/user/{id}

This will make the serverless.yml file look like this:

service: solving-marketing-attribution

custom:
  # sma = Solving Marketing Attribution
  tableIdentify: 'sma-identify-${self:provider.stage}'
  tablePage: 'sma-event-page-${self:provider.stage}'
  tableAttribution: 'sma-event-attribution-${self:provider.stage}'
  tableUserMapping: 'sma-event-user-map-${self:provider.stage}'

provider:
  name: aws
  runtime: nodejs10.x
  stage: dev
  region: eu-west-1
  iamRoleStatements:
    - Effect: Allow
      Action:
        - dynamodb:Query
        - dynamodb:Scan
        - dynamodb:GetItem
        - dynamodb:PutItem
        - dynamodb:UpdateItem
        - dynamodb:DeleteItem
        - dynamodb:ListStreams
      Resource:
        - { "Fn::GetAtt": ["SegmentIdentifiesDynamoDBTable", "Arn" ] }
        - { "Fn::GetAtt": ["SegmentPageDynamoDBTable", "Arn" ] }
        - { "Fn::GetAtt": ["SegmentAttributionDynamoDBTable", "Arn" ] }
        - { "Fn::GetAtt": ["SegmentUserMappingDynamoDBTable", "Arn" ] }
  environment:
    IDENTIFY_TABLE: ${self:custom.tableIdentify}
    PAGE_TABLE: ${self:custom.tablePage}
    ATTRIBUTION_TABLE: ${self:custom.tableAttribution}
    USER_MAP_TABLE: ${self:custom.tableUserMapping}

functions:
  hello:
    handler: index.handler
    events:
      - http: 'POST /events'
  processPage:
    handler: processPage.handler
    events:
      - stream:
          type: dynamodb
          batchSize: 1
          startingPosition: LATEST
          arn:
            Fn::GetAtt:
              - SegmentPageDynamoDBTable
              - StreamArn
  processIdentify:
    handler: processIdentify.handler
    events:
      - stream:
          type: dynamodb
          batchSize: 1
          startingPosition: LATEST
          arn:
            Fn::GetAtt:
              - SegmentIdentifiesDynamoDBTable
              - StreamArn
  getAnonymous:
    handler: src/handlers/api/getAnonymous.handler
    events:
      - http: get /api/anonymous/{id}
  getUser:
    handler: src/handlers/api/getUser.handler
    events:
      - http: get /api/user/{id}

resources:
  Resources:
    SegmentIdentifiesDynamoDBTable:
      Type: 'AWS::DynamoDB::Table'
      Properties:
        StreamSpecification:
          StreamViewType: NEW_IMAGE
        AttributeDefinitions:
          - AttributeName: messageId
            AttributeType: S
        KeySchema:
          - AttributeName: messageId
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TableName: ${self:custom.tableIdentify}
    SegmentPageDynamoDBTable:
      Type: 'AWS::DynamoDB::Table'
      Properties:
        StreamSpecification:
          StreamViewType: NEW_IMAGE
        AttributeDefinitions:
          - AttributeName: messageId
            AttributeType: S
        KeySchema:
          - AttributeName: messageId
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TableName: ${self:custom.tablePage}
    SegmentAttributionDynamoDBTable:
      Type: 'AWS::DynamoDB::Table'
      Properties:
        AttributeDefinitions:
          - AttributeName: anonymousId
            AttributeType: S
          - AttributeName: eventId
            AttributeType: S
        KeySchema:
          - AttributeName: anonymousId
            KeyType: HASH
          - AttributeName: eventId
            KeyType: RANGE
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TableName: ${self:custom.tableAttribution}
    SegmentUserMappingDynamoDBTable:
      Type: 'AWS::DynamoDB::Table'
      Properties:
        AttributeDefinitions:
          - AttributeName: userId
            AttributeType: S
          - AttributeName: anonymousId
            AttributeType: S
        KeySchema:
          - AttributeName: userId
            KeyType: HASH
          - AttributeName: anonymousId
            KeyType: RANGE
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TableName: ${self:custom.tableUserMapping}

Then we need the two handlers for those API functions. Notice that we create a src/handlers/api directory, which we’ll move our other handlers into over time.

All the code to handle the API is already there (we added it in Day 3) so the handlers are pretty simple:

// /src/handlers/api/getUser.js

const { withStatusCode } = require('../../utils/response.util');
const dynamoDBFactory = require('../../dynamodb.factory');
const { UserToAnonymousModel } = require('../../models/UserToAnonymous');
const { SourceAttributionModel } = require('../../models/SourceAttribution');

const dynamoDb = dynamoDBFactory();
const model_usermap = new UserToAnonymousModel(dynamoDb);
const model_source = new SourceAttributionModel(dynamoDb);

const ok = withStatusCode(200, JSON.stringify);

exports.handler = async (event) => {
    const { id } = event.pathParameters;

    const anonymousIds = await model_usermap.getAnonymousIdsForUser(id);
    // get all things from sessions with those anonymousId
    const attributionSessions = await model_source.getForAnonymousIds(anonymousIds);

    return ok({
        anonymousIds : anonymousIds,
        sessions: attributionSessions
    });
};
// /src/handlers/api/getAnonymous.js

const { withStatusCode } = require('../../utils/response.util');
const dynamoDBFactory = require('../../dynamodb.factory');
const ok = withStatusCode(200, JSON.stringify);

const { SourceAttributionModel } = require('../../models/SourceAttribution');

const dynamoDb = dynamoDBFactory();
const model_source = new SourceAttributionModel(dynamoDb);

exports.handler = async (event) => {
    const { id } = event.pathParameters;

    const attributionSessions = await model_source.getForAnonymousId(id);

    return ok({
        sessions: attributionSessions
    });
};
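Both handlers lean on the withStatusCode helper from Day 3, which isn’t repeated in this post. As a reminder of what it roughly does, here is a minimal sketch. It assumes the helper is a curried function that builds a Lambda proxy response (a statusCode plus a string body); the real implementation in src/utils/response.util.js may well differ.

```javascript
// Hypothetical sketch of src/utils/response.util.js (assumption: the Day 3
// version may look different). It returns a function that wraps data in the
// { statusCode, body } shape API Gateway expects from a Lambda proxy handler.
const withStatusCode = (statusCode, formatter = (body) => body) => (data = null) => {
    const response = { statusCode };

    // Only attach a body when there is something to send back
    if (data !== null) {
        response.body = formatter(data);
    }

    return response;
};

module.exports = { withStatusCode };

// Usage, mirroring the handlers above: the body ends up as a JSON string
const ok = withStatusCode(200, JSON.stringify);
console.log(ok({ sessions: [] }));
```

Keeping the formatter as a parameter is what allows the handlers to pass JSON.stringify once and then just call ok(...) with plain objects.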

Deploy this with a quick sls deploy:

> sls deploy
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service solving-marketing-attribution.zip file to S3 (10.52 MB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
.......................................................................
Serverless: Stack update finished...
Service Information
service: solving-marketing-attribution
stage: dev
region: eu-west-1
stack: solving-marketing-attribution-dev
resources: 38
api keys:
  None
endpoints:
  POST - [$BASE_URL]/events
  GET - [$BASE_URL]/api/anonymous/{id}
  GET - [$BASE_URL]/api/user/{id}
functions:
  hello: solving-marketing-attribution-dev-hello
  processPage: solving-marketing-attribution-dev-processPage
  processIdentify: solving-marketing-attribution-dev-processIdentify
  getAnonymous: solving-marketing-attribution-dev-getAnonymous
  getUser: solving-marketing-attribution-dev-getUser
layers:
  None
Serverless: Run the "serverless" command to setup monitoring, troubleshooting and testing.

Testing the API and Event Processing

To test the attribution and the anonymousId-to-userId mapping a little better, let’s feed the system some events. I have created some sample events (all JSON files), so download those and put them in an events folder.

Archive of Segment events (5 in total).

mkdir events && cd events
wget https://ucarecdn.com/a75fc569-fcd0-4671-bd4f-2c93110234ca/sample_events.zip
unzip sample_events.zip
cd ..

Now make sure all your DynamoDB sma tables are empty and run the following command to POST all the events in one go (here $BASE_DOMAIN holds the base URL of your deployed API):

# easy way (all in one go); only the .json files, so the zip itself is skipped
for d in ./events/*.json; do http POST "$BASE_DOMAIN/events" < "$d"; done

# hard way (one by one)
http POST "$BASE_DOMAIN/events" < events/anon_page_1.json
http POST "$BASE_DOMAIN/events" < events/anon_page_2.json
...

You will now be able to query users by anonymousId or userId and get a list of their sessions and referral sources.

Testing by anonymousId

http GET $BASE_DOMAIN/api/anonymous/da8011b3-19bf-4d3d-b1b6-caa961cb5161

Should give you something like this:

http GET $BASE_DOMAIN/api/anonymous/da8011b3-19bf-4d3d-b1b6-caa961cb5161
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 783
Content-Type: application/json
Date: Sat, 25 Apr 2020 09:15:33 GMT
Via: 1.1 320b04684a5b16980772c5d36c63ecea.cloudfront.net (CloudFront)
X-Amz-Cf-Id: ONXCrxrOY2wufV6Gu8UtVjaCER_PHtWtA1JUw3Dx4fFK8OHQN1nL4g==
X-Amz-Cf-Pop: LHR61-C2
X-Amzn-Trace-Id: Root=1-5ea3ffb5-b7e30caadb18a74e68163166;Sampled=0
X-Cache: Miss from cloudfront
x-amz-apigw-id: LiTkUG-4joEFrgQ=
x-amzn-RequestId: 3e1dd584-da7d-40fa-9527-ecb4d7e9f2e1

{
    "sessions": [
        {
            "anonymousId": "da8011b3-19bf-4d3d-b1b6-caa961cb5161",
            "eventId": "2020-03-25T08:35:18.281Z-ajs-9662bd147ae61e72db92055f52810bd9",
            "messageId": "ajs-9662bd147ae61e72db92055f52810bd9",
            "referrer": {
                "engine": "google",
                "host": "www.google.co.in",
                "type": "search"
            },
            "referrerUrl": "https://www.google.co.in/",
            "timestamp": "2020-03-25T08:35:18.281Z",
            "url": "https://www.prezly.com/academy/relationships/crisis-communication/the-best-managed-pr-crises-of-2018",
            "userId": null
        },
        {
            "anonymousId": "da8011b3-19bf-4d3d-b1b6-caa961cb5161",
            "eventId": "2020-04-25T08:57:53.206Z-ajs-3b9ef982cdc9fde05a313068b7b39be5",
            "messageId": "ajs-3b9ef982cdc9fde05a313068b7b39be5",
            "referrer": {
                "type": "direct"
            },
            "referrerUrl": null,
            "timestamp": "2020-04-25T08:57:53.206Z",
            "url": "https://www.prezly.com",
            "userId": null
        }
    ]
}

And testing by userId

http GET $BASE_DOMAIN/api/user/john@gmail.com

Should give you something like this:

http GET $BASE_DOMAIN/api/user/john@gmail.com
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 976
Content-Type: application/json
Date: Sat, 25 Apr 2020 09:17:07 GMT
Via: 1.1 5da47734f496c05ba90c546c024fb779.cloudfront.net (CloudFront)
X-Amz-Cf-Id: hfpCCsc632rf_84U_bPXU2HRZep6oK_U4vnHQTGoHT3olX7BmoXq1w==
X-Amz-Cf-Pop: LHR61-C2
X-Amzn-Trace-Id: Root=1-5ea40012-442f3f982a8b83b4481e7856;Sampled=0
X-Cache: Miss from cloudfront
x-amz-apigw-id: LiTy7Eg3DoEF1aA=
x-amzn-RequestId: 3e60f0df-3f3a-4578-be57-188d5f697e8e

{
    "anonymousIds": [
        "f1ef0f8d-68d2-4fbc-a7e6-cade9d42f3aa"
    ],
    "sessions": [
        [
            {
                "anonymousId": "f1ef0f8d-68d2-4fbc-a7e6-cade9d42f3aa",
                "eventId": "2020-04-25T08:27:08.257Z-ajs-534d538d5ccb72c0da98473f25c91f49",
                "messageId": "ajs-534d538d5ccb72c0da98473f25c91f49",
                "referrer": {
                    "network": "linkedin",
                    "type": "social"
                },
                "referrerUrl": "https://www.linkedin.com/",
                "timestamp": "2020-04-25T08:27:08.257Z",
                "url": "https://www.prezly.com/case-studies/m-and-c-communications?li_fat_id=7c4f0e4a-9902-49af-83d2-3d2c803037af",
                "userId": null
            },
            {
                "anonymousId": "f1ef0f8d-68d2-4fbc-a7e6-cade9d42f3aa",
                "eventId": "2020-04-25T08:45:34.315Z-ajs-26168b569103f6176c369916e2b74d4a",
                "messageId": "ajs-26168b569103f6176c369916e2b74d4a",
                "referrer": {
                    "engine": "google",
                    "host": "www.google.com",
                    "type": "search"
                },
                "referrerUrl": "https://www.google.com/",
                "timestamp": "2020-04-25T08:45:34.315Z",
                "url": "https://www.prezly.com/academy/relationships/corporate-social-responsibility/the-relationship-between-pr-and-csr",
                "userId": null
            }
        ]
    ]
}
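Note that sessions in this response is an array of arrays: judging by the output, there is one inner array per anonymousId. If a single chronological list is easier to work with, it could be flattened before use. A minimal sketch, assuming the response shape shown above (flattenSessions is a made-up name, not something from the code base):

```javascript
// Hypothetical helper, not part of the original code.
// Takes the nested `sessions` array returned by /api/user/{id}
// (one inner array per anonymousId) and returns one flat list, oldest first.
// reduce/concat is used instead of Array.prototype.flat, which is not
// available on the nodejs10.x runtime.
function flattenSessions(nestedSessions) {
    return nestedSessions
        .reduce((all, group) => all.concat(group), [])
        .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
}

// Usage with an abbreviated version of the response above
const nested = [
    [
        { anonymousId: 'f1ef0f8d-68d2-4fbc-a7e6-cade9d42f3aa', timestamp: '2020-04-25T08:27:08.257Z' },
        { anonymousId: 'f1ef0f8d-68d2-4fbc-a7e6-cade9d42f3aa', timestamp: '2020-04-25T08:45:34.315Z' },
    ],
];
console.log(flattenSessions(nested).length); // 2
```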

So what do we have here? This is the response coming from the /api/user endpoint.

One UserId with 3 sessions from different sources


Looks good. I’m not sure yet how I am going to digest that data and pass it back to Segment, but let’s consider that a problem for later.
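To give an idea of one direction this could take, here is a rough sketch that reduces a flat list of sessions (such as the sessions array returned by /api/anonymous/{id}) to a first-touch and last-touch referrer. This is purely an assumption about a possible next step; summariseAttribution is a made-up name and nothing here is decided yet.

```javascript
// Hypothetical helper: summarises a flat list of session objects
// (same shape as in the API responses) into first- and last-touch referrers.
function summariseAttribution(sessions) {
    if (sessions.length === 0) {
        return { firstTouch: null, lastTouch: null, sessionCount: 0 };
    }

    // Copy before sorting so the caller's array is left untouched
    const sorted = [...sessions].sort(
        (a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp)
    );

    return {
        firstTouch: sorted[0].referrer,
        lastTouch: sorted[sorted.length - 1].referrer,
        sessionCount: sorted.length,
    };
}

// Usage with an abbreviated version of the anonymousId response
const summary = summariseAttribution([
    { timestamp: '2020-03-25T08:35:18.281Z', referrer: { engine: 'google', host: 'www.google.co.in', type: 'search' } },
    { timestamp: '2020-04-25T08:57:53.206Z', referrer: { type: 'direct' } },
]);
console.log(summary.firstTouch.type); // search
console.log(summary.lastTouch.type); // direct
```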

Running in production

So we’re ready to test drive this in production. Open your serverless.yml and change the stage to prod.

A quick sls deploy should create new tables, new event streams and new functions.

When the deployment is done, look in the output for the POST endpoint (the URL ending in /events). Copy that URL and log in to your Segment account.

Go to your project > destinations > webhooks and enable the webhook to this URL.


After that, you can go to the event tester to see if everything works.

Now let’s let this integration run for a few days (it’s the weekend after all) and check in on Monday to see if there are any issues we need to solve.