Parameters

Name Required Default Description
name yes “data-generation-lr”
rows yes number of rows to generate
cols yes number of columns to generate
output yes output file
save-mode no errorifexists Options are “errorifexists”, “ignore” (no-op if exists), and “overwrite”
eps no 2 epsilon factor by which examples are scaled
intercepts no 0.1 data intercept
partitions no 10 number of partitions

Examples

{
  name = "data-generation-lr"
  rows = 100000100
  cols = 24
  output = "/tmp/kmeans-data.csv"
}
{
  name = "data-generation-lr"
  rows = 100000000
  cols = 24
  output = "/tmp/kmeans-data.parquet"
  eps = 4500
  intercepts = 1.6
  parititions = 10
}