Map/Reduce scripting in NetSuite enables the efficient processing of large data sets by dividing tasks into manageable stages. Below is an explanation of key terms and concepts related to Map/Reduce scripts.
Buffer Size
The buffer size is an option on the script deployment record that determines the number of key-value pairs a map or reduce job can process before saving progress to the database. As a best practice, this value is typically set to 1.
Deployment Instance
A deployment instance is a task created to process a script deployment. When a deployment is submitted for processing, a deployment instance is generated. Only one unfinished instance of a specific script deployment can exist at any time.
exitOnError
This configuration option dictates the behavior of a script when an uncaught error occurs during a map or reduce function. If set to true, and retries are exhausted, the script exits the current stage and proceeds to the summarize stage.
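The effect of exitOnError can be pictured with a small model. What follows is a simplified sketch in plain JavaScript, assuming a single job that processes pairs one after another; it is not NetSuite's implementation. The `runStage` helper and its return shape are hypothetical, and retries are left out for brevity.

```javascript
// Simplified model of one stage processed sequentially by a single job.
// runStage is a hypothetical helper, not part of any NetSuite API.
function runStage(pairs, fn, exitOnError) {
  const processed = [];
  const errors = [];
  for (const pair of pairs) {
    try {
      fn(pair);
      processed.push(pair.key);
    } catch (err) {
      errors.push({ key: pair.key, message: err.message });
      if (exitOnError) {
        // Stop the stage here; the script would move on to summarize.
        return { processed, errors, exitedEarly: true };
      }
      // Otherwise the error is recorded and the next pair is processed.
    }
  }
  return { processed, errors, exitedEarly: false };
}

const pairs = [
  { key: '1', value: 'ok' },
  { key: '2', value: 'bad' },
  { key: '3', value: 'ok' },
];

// Hypothetical map function that fails on one specific value.
const mapFn = (pair) => {
  if (pair.value === 'bad') throw new Error('cannot process ' + pair.key);
};

const strict = runStage(pairs, mapFn, true);   // stops after the failing pair
const lenient = runStage(pairs, mapFn, false); // skips the failing pair, keeps going
```

In this model, `strict.processed` holds only key 1, while `lenient.processed` holds keys 1 and 3 and records a single error for key 2.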
Function Invocation
A function invocation refers to the execution of a function. In Map/Reduce scripts, functions like getInputData and summarize are invoked once, whereas map and reduce functions may be invoked multiple times, depending on the input data.
getInputData
This is the initial stage of a Map/Reduce script. In this stage, the script returns an object that can be transformed into key-value pairs. The getInputData entry point identifies the function executed during this stage.
Governance
Governance refers to the system of rules regulating resource usage in NetSuite. Each script type, including Map/Reduce, has specific governance limits.
Hard Limit
A hard limit is a governance restriction that, if violated, causes the current function invocation to terminate. Adhering to best practices helps avoid these interruptions.
Job
A job represents a unit of execution managed by SuiteCloud Processors. Map/Reduce scripts are processed through multiple jobs, with at least one job per stage.
Map Stage
The second stage of a Map/Reduce script, where data from the getInputData stage is processed. The map stage is optional, but every script must define at least one of the map and reduce functions.
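A map function might look roughly like the sketch below. The `context.key`, `context.value`, and `context.write` names follow the SuiteScript 2.x map context as commonly documented, but treat them as assumptions and check the official reference. The order data and the `makeMapContext` helper are hypothetical stand-ins so the example runs outside NetSuite.

```javascript
// Sketch of a map function. In a real script it would be exposed as the
// `map` entry point of the script module.
function map(context) {
  // Assumption: each value arrives as a JSON string describing an order.
  const order = JSON.parse(context.value);
  // Emit one key-value pair, keyed by customer, for the later stages.
  context.write({ key: order.customer, value: order.amount });
}

// Hypothetical stand-in for the context object NetSuite would supply.
// Keys and values are stored as strings to mimic serialized pairs.
function makeMapContext(key, value, output) {
  return {
    key,
    value,
    write: (pair) =>
      output.push({ key: String(pair.key), value: String(pair.value) }),
  };
}

const written = [];
map(makeMapContext('0', '{"customer":"acme","amount":40}', written));
map(makeMapContext('1', '{"customer":"globex","amount":15}', written));
```

Each call to `map` handles exactly one key-value pair, which is what allows the stage to be split across several jobs.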
Map/Reduce
A computing paradigm used for processing large data sets. In NetSuite, it serves as the foundation for the Map/Reduce script type.
Map/Reduce Script Type
A SuiteScript 2.x script type that uses the Map/Reduce paradigm to divide large-scale processing into smaller tasks.
Parallel Stage
A stage, such as map or reduce, that can be processed simultaneously by multiple jobs. This is in contrast to serial stages, which are handled by a single job.
Priority
The priority of a job determines the order in which it is processed. Priorities can be set in the script deployment record or on the SuiteCloud Processors Priority Settings page.
Processor
A processor is a virtual unit of computing power used by SuiteCloud Processors to execute a job. It is a single thread rather than a physical entity.
Processor Pool
The processor pool represents the total number of processors available to a NetSuite account. The concurrency limit on the script deployment record sets the maximum number of processors that a specific script deployment can use.
Reduce Stage
The fourth stage of a Map/Reduce script. A reduce function processes data that originated in the getInputData or map stage and was grouped by key during the shuffle stage. It is invoked once for each unique key in the dataset.
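The "once per unique key" behavior can be illustrated with a short sketch. It assumes the reduce context exposes `key` and `values` (an array of strings), as in the SuiteScript 2.x reduce context; the `groups` input, the `totals` object, and the `invocations` counter are hypothetical and exist only for illustration.

```javascript
const totals = {};
let invocations = 0;

// Sketch of a reduce function that sums all values for one key.
function reduce(context) {
  invocations += 1;
  // Assumption: values arrive as strings, so convert before summing.
  const sum = context.values.reduce((acc, v) => acc + Number(v), 0);
  totals[context.key] = sum;
}

// Hypothetical grouped input: one entry per unique key.
const groups = [
  { key: 'acme', values: ['40', '10'] },
  { key: 'globex', values: ['15'] },
];
groups.forEach((group) => reduce(group));
```

With two unique keys, `reduce` runs twice, and `totals` ends up holding 50 for acme and 15 for globex.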
retryCount
This configuration option determines whether, and up to how many times, the system reprocesses map or reduce function invocations that were interrupted by an uncaught error or a server restart.
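Retry behavior can be modeled as a loop around a single function invocation. The sketch below is an illustrative model only, since NetSuite applies retries internally. It assumes retryCount is a number of additional attempts; the `config` object, `invokeWithRetries`, and `flaky` are hypothetical names.

```javascript
// Hypothetical configuration, mirroring the option described above.
const config = { retryCount: 2 };

// Invoke fn once, then retry up to retryCount more times on uncaught errors.
function invokeWithRetries(fn, retryCount) {
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      return { ok: true, value: fn(attempts), attempts };
    } catch (err) {
      if (attempts > retryCount) {
        // Retries exhausted: report the failure to the caller.
        return { ok: false, error: err, attempts };
      }
    }
  }
}

// A function that fails on its first two attempts, then succeeds.
const flaky = (attempt) => {
  if (attempt < 3) throw new Error('transient failure');
  return 'processed';
};

const result = invokeWithRetries(flaky, config.retryCount);
```

Here `result.ok` is true after three attempts. With a retry count of 1, the same function would give up after its second attempt.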
Script Deployment
A script deployment defines how a script is executed. It includes configuration settings and must be submitted for processing to run a Map/Reduce script.
Serial Stage
A stage processed entirely by a single job, such as getInputData, shuffle, or summarize. Serial stages contrast with parallel stages.
Shuffle Stage
The third stage of a Map/Reduce script, responsible for grouping values by unique key so that each key is paired with an array of its values. This step prepares data for the reduce stage.
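The grouping performed by the shuffle stage can be sketched as follows. This is a plain JavaScript model, not NetSuite's implementation; the `shuffle` function name is hypothetical, and ordering the groups by key is an assumption made here only to keep the output deterministic.

```javascript
// Group key-value pairs by key, then order the groups by key.
function shuffle(pairs) {
  const groups = new Map();
  for (const { key, value } of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return [...groups.entries()]
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, values]) => ({ key, values }));
}

const shuffled = shuffle([
  { key: 'globex', value: '15' },
  { key: 'acme', value: '40' },
  { key: 'acme', value: '10' },
]);
```

Three input pairs with two distinct keys produce two groups: acme with the values 40 and 10, and globex with the value 15.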
Soft Limit
A soft limit is a governance restriction that, if exceeded, causes a job to yield and reschedule. Unlike hard limits, soft limits do not interrupt function invocations.
SuiteCloud Processors
SuiteCloud Processors handle the execution of Map/Reduce and scheduled scripts by managing jobs and resources.
Summarize Stage
The final stage of a Map/Reduce script, used to log or process summary data from the script’s execution.
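A summarize function often just reports errors collected during earlier stages. The sketch below assumes the summary object exposes `mapSummary.errors` with an `iterator().each(callback)` interface, as in the SuiteScript 2.x summary context; verify this against the official reference. The `logError` helper stands in for the N/log module, and `mockErrors` is a hypothetical mock so the example runs outside NetSuite.

```javascript
// Hypothetical stand-in for N/log so the sketch runs outside NetSuite.
const logged = [];
const logError = (title, details) => logged.push(title + ': ' + details);

// Sketch of a summarize function that logs every map-stage error.
function summarize(summary) {
  // Assumption: the callback returns true to continue iterating.
  summary.mapSummary.errors.iterator().each((key, error) => {
    logError('map error for key ' + key, error);
    return true;
  });
}

// Minimal mock of the assumed error iterator interface.
function mockErrors(entries) {
  return {
    iterator: () => ({
      each: (callback) => {
        for (const [key, error] of entries) {
          if (callback(key, error) !== true) break;
        }
      },
    }),
  };
}

summarize({
  mapSummary: { errors: mockErrors([['2', 'cannot process 2']]) },
});
```

After the call, `logged` contains one entry describing the failure for key 2.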
Task
A task represents the work of a Map/Reduce script deployment. Each task comprises multiple jobs, with at least one job for each script stage.
Yielding
Yielding is the process by which a job automatically pauses after surpassing a soft limit, rescheduling its remaining work for later. This mechanism ensures efficient resource management without interrupting function invocations.
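The relationship between soft limits and yielding can be modeled in a few lines. This is a simplified sketch, not NetSuite's scheduler. The usage costs and the soft limit are made-up numbers, `runJobs` is a hypothetical helper, and the limit is checked only between invocations, so no invocation is cut short.

```javascript
// Simplified model: each invocation costs some usage units. The job checks
// the soft limit only after an invocation has finished.
function runJobs(costs, softLimit) {
  const finished = [];
  let current = { invocations: 0, usage: 0 };
  for (const cost of costs) {
    current.invocations += 1;
    current.usage += cost;
    if (current.usage > softLimit) {
      // Soft limit exceeded: yield, and let a new job take the remaining work.
      finished.push(current);
      current = { invocations: 0, usage: 0 };
    }
  }
  if (current.invocations > 0) finished.push(current);
  return finished;
}

// Five invocations of 40 units each against a soft limit of 100 units.
const jobLog = runJobs([40, 40, 40, 40, 40], 100);
const yields = jobLog.length - 1;
```

In this run the first job completes three invocations (120 units) before yielding, and a second job handles the remaining two, so the work is spread over two jobs with one yield.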