The application domain spans many different frameworks, tools, methods, classes of algorithms,
and data handling requirements: lots of variance.
How can a workflow be defined so it's flexible and adaptable to changing requirements?
Use interfaces, and defer the specific (concrete) processing implementations to an
externally managed configuration/policy layer.
Define processing work flows as a sequence of stages.
Each stage exposes a contextually different processing interface.
Achieve flexibility by managing change externally, aided by object factories (see the sketch below).
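A minimal sketch of the factory idea, assuming a hypothetical stages.json policy file and hypothetical stage function names: the workflow depends only on the stage contract (objects in, objects out), while a small factory resolves each concrete stage by name from the externally managed configuration.

```powershell
# Hypothetical policy file, stages.json:
#   { "stages": [ "Invoke-DataExtraction", "Invoke-Cleaning", "Invoke-Prediction" ] }

# Factory: resolve a concrete stage implementation by name at runtime.
function New-PipelineStage {
    param([Parameter(Mandatory)][string]$StageName)
    # The caller never references a concrete type; only the stage contract matters.
    Get-Command -Name $StageName -ErrorAction Stop
}

# Build the workflow from the externally managed policy layer.
$policy = Get-Content -Raw -Path '.\stages.json' | ConvertFrom-Json
$stages = foreach ($name in $policy.stages) { New-PipelineStage -StageName $name }
```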
MLiPS Pipeline:
Plugins select user-defined processing:
each “stage” abstractly defines a step of the workflow by
exposing an interface, a contract for processing.
Dynamically rebind pipeline stages at runtime.
Each stage is built around a custom PowerShell cmdlet (sketched below, after the pipeline diagram).
Pipeline: => [ Data extraction | cleaning | formatting | transformation | { prediction or classification } ]
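A minimal sketch of one stage and of dynamic rebinding, reusing the hypothetical stage names above: each stage is an advanced function (cmdlet-style) honoring the same pipeline contract, and the workflow is rebound at runtime by chaining whatever stage list the policy layer supplied.

```powershell
# One stage: a cleaning step that reads records from the pipeline and emits
# cleaned records for the next stage. The record shape (a Text property) is
# an assumption for illustration.
function Invoke-Cleaning {
    [CmdletBinding()]
    param([Parameter(ValueFromPipeline)] $Record)
    process {
        $Record.Text = ($Record.Text -replace '\s+', ' ').Trim()
        $Record
    }
}

# Dynamic rebinding: chain whichever stages the configuration resolved,
# without editing any pipeline code.
$data = Get-ChildItem -Path '.\input' -File
foreach ($stage in $stages) {
    $data = $data | & $stage
}
```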
No explicit changes to code!
No proliferation of scripts.
Maximizes reuse.
Use any framework/algorithm as a plugin.
Combine different tools and frameworks so each is used for what it does best.
Note: PowerShell cmdlets can invoke Java applications such as the Tika and OpenNLP tools by wrapping
them in OS processes, and Java applications can, in turn, run PowerShell cmdlets. This interoperability
makes it plausible to incorporate external tools in a PowerShell-based framework.
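A minimal sketch of the wrapping idea, assuming the Apache Tika CLI jar (here tika-app.jar; the path and exact flags may differ by version) and a Java runtime are available on the machine:

```powershell
# Wrap the Tika CLI in an OS process so text extraction becomes a pipeline stage.
function Invoke-TikaExtraction {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline)] [string] $Path,
        [string] $TikaJar = '.\tika-app.jar'   # assumed location of the Tika app jar
    )
    process {
        # Tika's --text switch writes the extracted plain text to stdout,
        # which PowerShell captures and passes down the pipeline.
        & java -jar $TikaJar --text $Path
    }
}

# Example use: (Get-ChildItem .\docs\*.pdf).FullName | Invoke-TikaExtraction
```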