CONTRIBUTING.md
Contributing to Stuart ML

Guidelines

  • Busted-based TDD
  • Class module names begin with an uppercase letter, and each class lives in its own file whose name also begins with an uppercase letter (e.g. RDD.lua)
  • Two spaces for indents.
  • The _ global variable is the stand-in for unused variables.
  • Alphabetized functions within a module.
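
As an illustration of the guidelines above, here is a minimal sketch of a class module. The `Counter` class and its file name `Counter.lua` are hypothetical and not part of Stuart ML; plain metatables are used only to keep the sketch self-contained, and a real module may use a different class helper.

```lua
-- Hypothetical file: Counter.lua (uppercase class name, uppercase file name).
-- Two spaces per indent; functions in alphabetical order: add, new, total.
local Counter = {}
Counter.__index = Counter

function Counter:add(values)
  -- `_` marks the loop index as deliberately unused
  for _, v in ipairs(values) do
    self.sum = self.sum + v
  end
  return self
end

function Counter.new()
  return setmetatable({sum = 0}, Counter)
end

function Counter:total()
  return self.sum
end

-- In a real module file, the last line would be: return Counter
local c = Counter.new():add({1, 2, 3})
print(c:total())
```

Under the Busted-based TDD guideline, a spec for such a class would normally be written before the class itself.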

Strict Rules for Encapsulation

In the Apache Spark source code, some RDD-based ML models, such as KMeans.scala, have introduced dependencies on DataFrame-based modules. This is code sharing within a monolithic application, where individual jar files are not actually usable without all the others. That is unacceptable in an edge environment, where end users will depend heavily on the Lua Amalgamator's or Transpiler's ability to strip out unused modules.

For example, if an end user's Spark job uses RDD-based ML models and never references DataFrames, then the underlying RDD-based models MUST NOT reference DataFrames or Stuart SQL. Where a port of Spark code would carry over such a dependency, comb through the commits that introduced it and unwind it by reconstructing, and then duplicating, the required code.
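
One way to keep this rule from regressing is an automated check over module sources. The following is a rough sketch only: the function name `forbiddenRequires`, the sample source, and the module names `stuart.class` and `stuart-sql.DataFrame` are assumptions made for illustration, not an existing project tool.

```lua
-- Matches require 'name', require "name", and require('name').
local REQUIRE_PATTERN = "require%s*%(?%s*[\"']([^\"']+)[\"']"

-- Returns the required module names in `source` that start with any of
-- the given forbidden prefixes (compared as plain text, not as patterns).
local function forbiddenRequires(source, forbiddenPrefixes)
  local found = {}
  for name in source:gmatch(REQUIRE_PATTERN) do
    for _, prefix in ipairs(forbiddenPrefixes) do
      if name:sub(1, #prefix) == prefix then
        found[#found + 1] = name
        break
      end
    end
  end
  return found
end

-- Illustrative source text for a hypothetical RDD-based model module.
local sample = [[
local class = require 'stuart.class'
local DataFrame = require 'stuart-sql.DataFrame'
]]

local hits = forbiddenRequires(sample, {'stuart-sql'})
for _, name in ipairs(hits) do
  print('forbidden dependency: ' .. name)
end
```

This is a plain text scan, so it would also flag a `require` that appears inside a comment or string. That is arguably acceptable for a strictness check, but a real tool might parse the source instead.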