Custom processors are JavaScript scripts that transform data either before or after Chain.io processes it.
With a little coding experience, you can use custom processors to enhance your flows for specific use cases such as:
- adding or removing fields
- augmenting data with custom calculations
- making adjustments to data formats
- running extra validations
Types of Custom Processors
There are two types of custom processors, and they work the same way.
A pre-processor runs on the data after Chain.io receives it from the source and before your Source File Type adapter runs. Use this to alter or augment data before it runs through the processing engine.
A post-processor runs on the data after the Destination File Type adapter runs and before the data is delivered to its final location. Use this to alter or augment data after it runs through the processing engine.
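The ordering described above can be pictured with a small, purely illustrative sketch. None of the names here are Chain.io APIs; the stage functions are hypothetical stand-ins that only show where each processor sits relative to the adapters.

```javascript
// Hypothetical stand-ins for the stages of a flow; these are not Chain.io APIs.
const stages = [
  { name: 'pre-processor', run: (files) => files.map((f) => ({ ...f, body: f.body.trim() })) },
  { name: 'source file type adapter', run: (files) => files },
  { name: 'destination file type adapter', run: (files) => files },
  { name: 'post-processor', run: (files) => files.map((f) => ({ ...f, body: f.body.toUpperCase() })) }
]

// Run every stage in order and record the order in which the stages executed.
const runFlow = (files) => {
  const executed = []
  const result = stages.reduce((current, stage) => {
    executed.push(stage.name)
    return stage.run(current)
  }, files)
  return { executed, result }
}

const flowRun = runFlow([{ type: 'file', body: '  hello  ' }])
console.log(flowRun.executed.join(' -> '))
console.log(flowRun.result[0].body)
```

In this sketch the pre-processor sees the raw data first and the post-processor sees the final output last, which is the only point it is meant to illustrate.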
How To Add A Custom Processor
You add Custom Processors by adding code in the Custom Processors section of the Edit Flow Screen. There is a separate data entry box for the pre-processor and post-processor. When you deploy the flow, the custom processor will be automatically invoked at the appropriate time during the flow execution.
How to Write A Custom Processor
Custom processors are scripts written in JavaScript. They run in a sandboxed Node.js environment running Node 14 or higher. You write your code using normal Node.js JavaScript conventions, with the following exceptions:
- You cannot write asynchronous code; async/await and Promises are not allowed.
- You cannot require or import libraries.
- Your code must execute within 30 seconds or it will be terminated resulting in a failure state.
- You cannot use scheduling functions such as setTimeout or setInterval.
- You cannot access the Symbol object or its related functions.
- Your code must be less than or equal to 10,000 characters in total. This limit is per custom processor so you can include both a 10,000 character pre-processor and a 10,000 character post-processor in the same flow setup.
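Before pasting a script into the Edit Flow Screen, it can help to check it locally against the restrictions listed above. The following is a rough sketch, not Chain.io's own validation: `checkProcessorSource` and its patterns are hypothetical, and because it uses simple regular expressions it will also flag forbidden words that appear only inside comments or strings.

```javascript
// Rough local pre-flight check for a custom processor script.
// This is NOT Chain.io's validator; the patterns below are simple heuristics.
const MAX_SCRIPT_LENGTH = 10000

const FORBIDDEN_PATTERNS = [
  { pattern: /\basync\b|\bawait\b|\bPromise\b/, message: 'asynchronous code is not allowed' },
  { pattern: /\brequire\s*\(|^\s*import\s/m, message: 'libraries cannot be required or imported' },
  { pattern: /\bsetTimeout\b|\bsetInterval\b/, message: 'scheduling functions are not allowed' },
  { pattern: /\bSymbol\b/, message: 'the Symbol object is not accessible' }
]

// Returns a list of human-readable problems; an empty list means no issue was spotted.
const checkProcessorSource = (source) => {
  const problems = []
  if (source.length > MAX_SCRIPT_LENGTH) {
    problems.push(`script is ${source.length} characters; the limit is ${MAX_SCRIPT_LENGTH}`)
  }
  for (const { pattern, message } of FORBIDDEN_PATTERNS) {
    if (pattern.test(source)) problems.push(message)
  }
  return problems
}

const goodScript = 'returnSuccess(sourceFiles)'
const badScript = "const fs = require('fs')\nsetTimeout(() => returnSuccess([]), 10)"
console.log(checkProcessorSource(goodScript))
console.log(checkProcessorSource(badScript))
```

The 30 second execution limit cannot be checked this way; it depends on the data the script receives at run time.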
Within your script, you'll have access to the following global variables:
- sourceFiles (pre-processors only): An array of File objects including the data that was submitted to the flow. You should expect zero or more File objects. You may receive multiple file objects in certain scenarios like an email source connection where the body and each attachment are sent as separate file objects. You may need to iterate through the sourceFiles to identify the one you need to work with.
- destinationFiles (post-processors only): An array of File objects including the data that was returned by the flow. You should expect zero or more File objects. You would receive an empty array if the destination file adapter did not return any data and you may receive multiple files if the output adapter returns multiple objects. For example, you may submit a CSV with multiple shipments which are output as separate shipment registration requests to a CO2 provider. In that case, each registration request would be represented by a different File object.
- userLog: An object that you can use to write messages that the user will see in the flow execution screen. The object has 3 functions which all take a single String variable: info(), warning(), and error().
- returnSuccess, returnSkipped, returnError: Call one of these functions as the last line that executes in your script and pass in an array of 0 or more file objects.
- lodash: An instance of the lodash library at version >= 4.17.21. This library is helpful for accomplishing many common JavaScript tasks for handling objects and arrays without writing extra code.
- xmldom: An instance of @xmldom/xmldom at version >= 0.8.8 that you can use to parse XML files.
- xpath: An instance of xpath at version >= 0.0.32 that you can use with xmldom to quickly access values in XML files via XPath notation.
- xml: An instance of our XML helper package, which wraps both xmldom and xpath in one module that makes it easier to parse, query, and write XML data. See below for an example.
  - xml.XmlParser
    - parseFromString(string, optionsObject) - static method that reads a string and converts it to a Document. By default, it strips namespaces. To retain namespaces, pass false as the strip_namespaces option (ex. `xml.XmlParser.parseFromString(xmlString, { strip_namespaces: false })`)
  - xml.Document
    - createDocument(rootElementName) - static method to create a new Document. rootElementName should be a string. If the document belongs to a namespace, the rootElementName should reflect that namespace alias (ex. `xml.Document.createDocument('ns:rootElementName')`)
    - rootElement() - instance method that returns the root element of the Document
  - xml.XmlSerializer
    - serializeToString(documentOrElement) - instance method to convert a Document or Element to a string. By default, it minifies the document, removing all unnecessary whitespace between elements. To pretty print the document with the default 2-space indentation, pass the format constructor option with an empty object (ex. `new xml.XmlSerializer({ format: {} }).serializeToString(document)`)
  - xml.elements(element, xpathExpressionString) - function to return all non-namespaced elements that match the xpath expression
  - xml.element - function to return the first non-namespaced element that matches the xpath expression
  - xml.text - function to return the text of the first non-namespaced element that matches the xpath expression
  - xml.date - function to return a Luxon DateTime object from the text of the first non-namespaced element that matches the xpath expression. The date string is assumed to be in ISO-8601 format.
  - xml.integer - function to return a JS number representing an integer value from the element text matching the xpath expression. If the XML text is a decimal value, the value will be rounded using the standard round half-up rules.
  - xml.decimal - function to return a Decimal.js decimal object from the element text matching the xpath expression. Use this function to avoid working with imprecise float values.
  - xml.boolean - function to return a boolean value from the element text matching the xpath expression. The values 'true', '1', and 'yes' evaluate to true; anything else evaluates to false.
  - xml.Xpath - a class that provides namespaced versions of all the above xpath helper functions.
    Example:
      const document = xml.XmlParser.parseFromString(xmlString, { strip_namespaces: false })
      const namespacedXpath = new xml.Xpath({ ns1: 'http://chain.io/namespace/one', ns2: 'http://chain.io/namespace/two' })
      const codes = namespacedXpath.elements(document, '/ns1:events/ns1:event').map(eventElement => namespacedXpath.text(eventElement, 'ns1:code'))
- publishDataTags: A function that publishes “Data Tags” to the Flow Execution Screen. (See example below.)
  - label - string - required - The "Label" for the data tag. "Forwarder Ref" in the below example.
  - value - string - required - The "Value" for the data tag. "CDE1009317" in the below example.
- DateTime: An instance of the Luxon DateTime object at version >= 3.3.0 that you can use to manipulate times and dates.
- uuid: A v4 compliant uuid generator. Usage:
  uuid() // 51639e8d-a12b-4bba-be4b-e2337b166a9c
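As a minimal sketch of how several of these globals fit together, the script below picks one file out of sourceFiles (for example, the CSV attachment of an email), logs what it did, and ends with one of the return functions. The first section only defines stand-ins so the sketch can run outside Chain.io; they are assumptions, not the real globals. The file names and the choice to match on a `.csv` suffix are hypothetical, and the `file_name` property is assumed from the XML example later in this article.

```javascript
// --- Stand-ins so the sketch runs outside Chain.io (assumptions, not the real globals) ---
const sourceFiles = [
  { type: 'file', file_name: 'body.txt', body: 'Please see attached.' },
  { type: 'file', file_name: 'shipments.csv', body: 'ref,weight\nA1,10' }
]
const logged = []
const userLog = {
  info: (msg) => logged.push(`INFO ${msg}`),
  warning: (msg) => logged.push(`WARNING ${msg}`),
  error: (msg) => logged.push(`ERROR ${msg}`)
}
let outcome
const returnSuccess = (files) => { outcome = { status: 'success', files } }
const returnSkipped = (files) => { outcome = { status: 'skipped', files } }

// --- The part that would go in the pre-processor box ---
// Look for the first file whose name ends in .csv, ignoring case.
const csvFile = sourceFiles.find((f) => String(f.file_name || '').toLowerCase().endsWith('.csv'))
if (!csvFile) {
  userLog.warning('No CSV attachment found, skipping')
  returnSkipped([])
} else {
  userLog.info(`Processing ${csvFile.file_name}`)
  returnSuccess([csvFile])
}
```

Only the second section would be pasted into the flow, since Chain.io supplies the real sourceFiles, userLog, and return functions.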
A Hello World Example
This simple pre-processor example replaces the body of every file in the flow with "Hello World!".
const myFiles = sourceFiles.map((f) => {
  f.body = 'Hello World!'
  return f
})
returnSuccess(myFiles) // the output of your last command in the script will be returned
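The same pattern works in a post-processor, using destinationFiles instead of sourceFiles. The sketch below assumes each destination file has a JSON body and shows removing one field and adding another. The field names (`internal_note`, `processed_by`) are hypothetical, and the first section only defines stand-ins for the sandbox globals so the sketch can run on its own.

```javascript
// --- Stand-ins for the sandbox globals (assumptions so the sketch runs on its own) ---
const destinationFiles = [
  { type: 'file', body: JSON.stringify({ reference: 'ABC123', internal_note: 'do not send' }) }
]
let returned
const returnSuccess = (files) => { returned = files }

// --- The part that would go in the post-processor box ---
const updatedFiles = destinationFiles.map((file) => {
  const data = JSON.parse(file.body)
  delete data.internal_note // remove a field
  data.processed_by = 'custom-post-processor' // add a field
  // return a copy of the file with the new body rather than mutating the original
  return { ...file, body: JSON.stringify(data) }
})
returnSuccess(updatedFiles)
```

Returning a copy of each file, as the XML example in this article also does, leaves the original objects untouched, which can make a script easier to reason about.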
An Example Using XML Processing
Let's say you have a system that submits XML data to you that you want to quickly convert into Chain.io Standard Event JSON for further processing. You only want to process files that contain an Arrival event, and you want your flow to skip anything else.
Assume the XML looks like this:
<events>
  <master_bill>MBOL123</master_bill>
  <event>
    <!-- time in epoch seconds -->
    <actual>1542674993</actual>
    <code>Arrival</code>
  </event>
  <event>
    <actual>1542674000</actual>
    <code>Departure</code>
    <estimated>true</estimated>
  </event>
</events>
You could write a pre-processor script like this:
const makeEventJSON = (event, masterBill) => {
  // read the epoch seconds from the <actual> element as an integer value
  const epochSeconds = xml.integer(event, 'actual')
  // Use the luxon DateTime module to convert the date to the ISO 8601 format
  const eventDate = DateTime.fromSeconds(epochSeconds).toUTC().toISO()
  const estimated = xml.boolean(event, 'estimated')
  // If the data from the source file was already formatted in ISO 8601 format,
  // we could have used the xml.date function - which returns a Luxon DateTime object
  // const actualDate = xml.date(event, 'actual')?.toUTC()?.toISO()
  const code = xml.text(event, 'code')
  return {
    actual_date: estimated ? undefined : eventDate,
    estimated_date: estimated ? eventDate : undefined,
    event_code: code,
    master_bill: masterBill
  }
}
const handleFile = (file) => {
  // get the body
  const source = file.body
  // parse the xml - by default parseFromString will remove namespaces.
  // If you're simply extracting data, then this is ok, however if you're
  // modifying an xml document in place, you typically do not want to remove namespaces.
  let xmlDocument
  try {
    xmlDocument = xml.XmlParser.parseFromString(source, { strip_namespaces: false })
  } catch (err) {
    userLog.warning(`Skipping invalid xml file ${file.file_name}.`)
    return
  }
  // only process files that have the xml we're looking for
  if (xmlDocument.rootElement().localName() !== 'events') {
    userLog.warning(`Unexpected root element in file ${file.file_name}. Expected <events>. Skipping file.`)
    return
  }
  // use xpath to check for the Arrival event we care about
  // We could have also used the 'elements' xpath function to return all events
  const arrivalEvent = xml.element(xmlDocument, '/events/event[code = "Arrival"]')
  if (!arrivalEvent) {
    userLog.warning("No Arrival events found, so we're not going to pass anything into the rest of the flow")
    return
  }
  // Extract the text from the master_bill element
  const masterBill = xml.text(xmlDocument, '/events/master_bill')
  // create the JSON representation of the events
  const eventJSON = xml.elements(xmlDocument, '/events/event').map((eventElement) => makeEventJSON(eventElement, masterBill))
  // create Chain.io standard event json structure
  const standardJSON = {
    doc_type: 'event_json',
    events: eventJSON
  }
  // replace the body with the standard json and return the file
  return {
    ...file,
    body: JSON.stringify(standardJSON)
  }
}
userLog.info('Beginning custom script')
// process each matching file and use the result as the return value of the script
// filter out any results that were skipped (i.e. they returned an undefined value)
const output = sourceFiles.map(handleFile).filter(x => x)
if (output.length === 0) {
  // If there are no files that we've chosen to process, cause the flow to stop
  // and show a 'skipped' status
  returnSkipped([])
} else {
  returnSuccess(output)
}
Given this script, the output would be:
[
  {
    type: 'file',
    body: '{"doc_type":"event_json","events":[{"actual_date":"2018-11-20T00:49:53.000Z","event_code":"Arrival","master_bill":"MBOL123"},{"estimated_date":"2018-11-20T00:33:20.000Z","event_code":"Departure","master_bill":"MBOL123"}]}'
  }
]
Note: Any file with invalid XML or without a root element of 'events' is left out of the output, and a warning is written to the execution log in the Portal. If no files remain, the flow execution is skipped.
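The timestamps in the sample output can be cross-checked without Luxon, using only the built-in JavaScript Date object. This is just a verification sketch; `epochSecondsToISO` is a hypothetical helper name, and the script above uses DateTime.fromSeconds instead.

```javascript
// Convert epoch seconds to an ISO 8601 UTC string using only built-in JavaScript.
// Date expects milliseconds, so the seconds value is multiplied by 1000.
const epochSecondsToISO = (epochSeconds) => new Date(epochSeconds * 1000).toISOString()

console.log(epochSecondsToISO(1542674993)) // 2018-11-20T00:49:53.000Z
console.log(epochSecondsToISO(1542674000)) // 2018-11-20T00:33:20.000Z
```

Both values match the dates shown in the sample output for the Arrival and Departure events.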
Custom Data Tags
The code below shows how a post-processor can add your own custom data tags to your Flow execution.
publishDataTags([{label: 'Forwarder Ref', value: 'CDE1009317'}]) // pass in an array of data tags
publishDataTags({label: 'Forwarder Ref', value: 'CDE1009317'}) // pass in a single data tag object
publishDataTags('invalid pre-proc tag') // invalid
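In practice the tag value usually comes from the data rather than a literal. The sketch below assumes each destination file has a JSON body with a `forwarder_reference` field, which is a hypothetical field name. The `publishDataTags` defined here is only a stand-in so the sketch can run outside Chain.io; how the real function reacts to invalid input is not documented here, so the stand-in's choice to throw is an assumption.

```javascript
// --- Stand-in for the sandbox global (the real publishDataTags is provided by Chain.io) ---
// It records tags and rejects anything without a string label and a string value.
const published = []
const publishDataTags = (tags) => {
  const list = Array.isArray(tags) ? tags : [tags]
  for (const tag of list) {
    const valid = tag !== null && typeof tag === 'object' &&
      typeof tag.label === 'string' && typeof tag.value === 'string'
    if (!valid) throw new TypeError('each data tag needs a string label and a string value')
    published.push({ label: tag.label, value: tag.value })
  }
}

// Hypothetical destination file whose JSON body carries a forwarder reference
const destinationFiles = [
  { type: 'file', body: JSON.stringify({ forwarder_reference: 'CDE1009317' }) }
]

// --- The part that would go in the post-processor box ---
// Build one tag per file that has a non-empty reference, then publish them together.
const tags = destinationFiles
  .map((file) => JSON.parse(file.body).forwarder_reference)
  .filter((ref) => typeof ref === 'string' && ref.length > 0)
  .map((ref) => ({ label: 'Forwarder Ref', value: ref }))
publishDataTags(tags)
```

Filtering out missing references before publishing keeps every tag within the documented requirement that both label and value are strings.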