Creating an API for an in-house application

Trelica is a SaaS management platform that helps IT teams understand and manage the applications and assets in use. This can include building an inventory of all applications, as well as helping with reviewing access and managing onboarding and offboarding.

Trelica offers a large library of connectors to popular SaaS applications, but some customers have systems, built in-house, that they want to connect to Trelica. 

If an API already exists, please share the documentation and we can evaluate how easy it would be to build an interface to it.

However, if you haven’t got an API in place and are looking to create one, this article explains some of the considerations which, in our experience, will help you build a good user management API.

The focus is primarily on reading user data and offboarding users, as these are simple and common use cases for Trelica.

Technical approach

In terms of approach, we are used to dealing with hundreds of different APIs, so our preference is for you to use whatever is most secure, maintainable and efficient for your technical team.

If you already have frameworks or tooling in place to quickly deliver APIs, then please use these and we will accommodate them.

These recommendations assume a simple REST API. Other approaches, such as GraphQL, would likely add a significant amount of additional complexity, unless one is already in use and can easily be repurposed.

If SCIM is already in place then it can be extended with custom schemas to return additional data.

The rest of this article assumes you are writing the API from scratch and recommends approaches to take and pitfalls to avoid.

API recommendations for Trelica

Versioning

Whilst versioning is a consideration, it’s probably sufficient just to plan for this by putting a version into the URL, e.g. /api/v1

Authentication

Our recommended option would be OAuth2 Client Credentials, but a simple API token in a header would be fine too.

Client credentials

The benefit of client credentials is that the long-lived credentials are only exchanged at the start of a session, and a short-TTL access token is used for the remaining API requests, which reduces the chances of tokens leaking into logs etc.

It also encourages an API design that uses authorization scopes.

E.g. something along the lines of this to issue the token:

curl https://api.yourorg.com/api/v1/connect/token \
--user "<CLIENT_ID>:<CLIENT_SECRET>" \
--data "grant_type=client_credentials&scope=Users.Read"

API token

The client credentials flow imposes some additional development overhead to implement correctly, even if you’re using an existing library.

Another good alternative is a strong (high-entropy) token passed as an HTTP header, e.g.

Authorization: Bearer <token>

Or

x-api-key: <token>

Tokens should never be passed in the query string.
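
For example, a request with the token passed in a header might look like this (the exact header name is up to you):

curl https://api.yourorg.com/api/v1/users \
--header "Authorization: Bearer <token>"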

JSON

Ideally, be consistent with attribute naming, e.g. use either camelCase or snake_case throughout.

Dates should be in UTC, and use RFC 3339 strings. E.g.

"lastLogin": "2023-11-17T12:42:22Z"

Pagination

We recommend paginating larger datasets. Trelica’s integration platform can deal with pretty much any pagination approach.

Offset / limit is possibly the simplest approach. 

Each request specifies an offset and limit (the number of items to return). E.g.

https://api.yourorg.com/api/v1/users?offset=300&limit=100

Things to watch for: 

  • Is the offset 0 or 1 based?
  • Offset and limit should default to certain values if not passed.
  • What happens if a request is made with an offset outside the bounds of the data? (e.g. offset=1000 where there are 920 items). Probably just return no data (i.e. an empty array).
  • How does the client know when to stop making requests? A common approach is to either:
    • Return an empty array once the client has paged past the end of the data, or
    • Add a hasMore attribute, or a JSON attribute containing the URL of the next page

E.g.

{
    "nextPage": "https://api.yourorg.com/api/v1/users?offset=100&limit=100",
    "users": [
        ...
    ]
}

No nextPage would imply that this is the last page.

Another approach is a Link header, but these need to be structured quite carefully to adhere to the relevant RFC (RFC 8288).
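
For reference, a Link header following RFC 8288 would look something like this (the URL is illustrative):

Link: <https://api.yourorg.com/api/v1/users?offset=400&limit=100>; rel="next"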

It is critical to sort the dataset for pagination to work correctly. With every API request, we would need the data to be sorted in the same way so that pages are delivered consistently.

The sort attribute should ideally be unique to avoid edge cases (e.g. sorting on the primary key is often best).

A final consideration is what happens if the dataset changes mid-way through a paginated request, e.g. 10 new users are added between the first and last page requests. It is usually acceptable to acknowledge that this is unlikely to happen, and that if it did it wouldn’t cause particular problems, at least from Trelica’s perspective.

Endpoints

List users

A typical response structure that we might expect:

{
    "nextPage": "https://....",
    "users": [{
        "id": "123456",
        "firstName": "Jane",
        "lastName": "Doe",
        "email": "jane.doe@example.com",
        "roles": ["Administrator"],
        "lastActivity": "2023-11-17T12:42:22Z",
        "created": "2021-10-12T00:42:19Z",
        "status": "Active"
    }, {
        ...
    }]
}

A unique identifier is useful and would be used in additional requests to identify the user (e.g. for deprovisioning). If no unique identifier is available, and the email address is unique, then the email address would suffice.

Names are ideally provided as separate first and last name attributes, but Trelica can accept a full name if that is easier. If both a preferred name and a legal first name are available, we take the preferred name, on the basis that most communication will use this name and the email address is likely based upon it.

User status - ideally return this explicitly as one of:

  • Active
  • Suspended (temporarily has no access)
  • Inactive (has been deactivated, probably permanently prior to deletion)

If a user is deleted then it’s fine simply not to return them; Trelica will assume deletion.

Created date is useful if it can be provided. Last updated date can be passed too but isn’t critical.

Roles are helpful - either simple strings or IDs. If roles change, or if you want to set roles from Trelica, then an endpoint that lists the currently valid roles would be advisable.
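
For example, a simple roles endpoint might look like the sketch below; the path and attribute names are only suggestions:

https://api.yourorg.com/api/v1/roles

{
    "roles": [
        { "id": "R01", "name": "Administrator" },
        { "id": "R02", "name": "Member" }
    ]
}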

Other attributes that can be useful (but would more likely come from an HR system):

  • Employee ID
  • Department
  • Cost center
  • Job title
  • Start date
  • Termination date

We can also support any other custom attribute that might be relevant to you.

Groups and group membership

Optional, but if included we would recommend:

  • Separate endpoint (paginated) to list groups
  • Separate endpoint (paginated) to list members of a specific group

E.g.

https://api.yourorg.com/api/v1/groups
https://api.yourorg.com/api/v1/groups/G542X/members

Some APIs return group membership as part of the user object, but it’s generally better to have separate group membership endpoints, as these lend themselves more easily to extending the API later (e.g. to add / remove group members or create new groups).

Where possible, it is a nice optimization if the endpoint that lists all groups returns the date that the membership of each group last changed (to avoid refetching members). 
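
For example, a groups list response might look like the sketch below; membershipLastChanged is just a suggested name for that attribute:

{
    "nextPage": "https://api.yourorg.com/api/v1/groups?offset=100&limit=100",
    "groups": [{
        "id": "G542X",
        "name": "Engineering",
        "membershipLastChanged": "2023-11-16T09:15:00Z"
    }, {
        ...
    }]
}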

Deactivation

This can be done via a PUT or PATCH operation on a user:

E.g.

curl -X PUT https://api.yourorg.com/api/v1/users/U5521K \
--data '{ "status": "Inactive" }'

Or via a specific ‘lifecycle’ REST method:

curl -X POST https://api.yourorg.com/api/v1/users/U5521K/deactivate

If implementing via PUT or PATCH, we’d recommend treating the absence of an attribute in the JSON as meaning ‘no change’ - i.e. don’t update the name to null if no name is passed.
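
For example, with PATCH semantics a request that only contains status should leave every other attribute untouched:

curl -X PATCH https://api.yourorg.com/api/v1/users/U5521K \
--header "Content-Type: application/json" \
--data '{ "status": "Inactive" }'

Here firstName, lastName etc. are not in the payload, so they should be left as they are rather than set to null.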

Audit log endpoint

Audit log data can be used to provide additional context or more detailed login history.

The sorts of fields to include in the audit log endpoint are listed below; an example event is sketched after the list:

  • A unique audit event ID. This is helpful for pagination and also so that we can avoid reprocessing the same event.
  • Event date/time
  • Actor
    • User ID
    • Email. This is useful to include as it avoids having to pre-fetch users to resolve IDs, and covers the case where an application (e.g. Trelica) fetches your audit logs on a more regular schedule than it fetches users and a new user has been created in between.
    • IP address. This is useful for security diagnostics.
  • Target
    • The object type acted on e.g. 'user' or 'account'
    • The id or other information about the object acted on
  • Event type
    • Clearly defined enumerated list of events, e.g. login, user_updated, account_deleted. You could consider classifying events by severity.
  • Action specific details, e.g. for Login
    • Log in method (e.g. SAML2, password)
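
Putting this together, an individual audit event might look something like this (the attribute names are illustrative):

{
    "id": "evt_000123",
    "timestamp": "2023-11-17T12:42:22Z",
    "actor": {
        "userId": "123456",
        "email": "jane.doe@example.com",
        "ipAddress": "203.0.113.10"
    },
    "target": {
        "type": "user",
        "id": "U5521K"
    },
    "eventType": "login",
    "details": {
        "loginMethod": "SAML2"
    }
}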

Audit logs tend to be large so there are some specific considerations:

  1. The endpoint must be filterable, in particular for:
    • events since a certain date/time
    • events of a specific type (e.g. successful login, failed_login)
  2. To be performant, pagination should probably use continuation (cursor) tokens rather than offset / limit (see the sketch below).
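
As a sketch, assuming since, eventType and cursor query parameters (the parameter names are just suggestions), a filtered, cursor-paginated request and response might look like this:

https://api.yourorg.com/api/v1/audit-events?since=2023-11-17T00:00:00Z&eventType=login&cursor=<cursor>

{
    "nextCursor": "<opaque continuation token>",
    "events": [
        ...
    ]
}

The first request would omit the cursor parameter; each subsequent request passes the nextCursor value back as cursor, and no nextCursor implies the last page.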
