Parameters
| Name | Required | Default | Description |
|------|----------|---------|-------------|
| name | yes | – | “data-generation-lr” |
| rows | yes | – | number of rows to generate |
| cols | yes | – | number of columns to generate |
| output | yes | – | output file |
| save-mode | no | errorifexists | Options are “errorifexists”, “ignore” (no-op if exists), and “overwrite” |
| eps | no | 2 | epsilon factor by which examples are scaled |
| intercepts | no | 0.1 | data intercept |
| partitions | no | 10 | number of partitions |
Examples
{
name = "data-generation-lr"
rows = 100000100
cols = 24
output = "/tmp/kmeans-data.csv"
}
{
name = "data-generation-lr"
rows = 100000000
cols = 24
output = "/tmp/kmeans-data.parquet"
eps = 4500
intercepts = 1.6
partitions = 10
}
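The examples above leave save-mode at its default. A minimal sketch of a run that replaces an existing output file might look like the following. It assumes save-mode is set the same way as the other keys and that the config format accepts `//` comments. The output path and the row count are illustrative values only.

```
{
  name = "data-generation-lr"
  // Illustrative size; any positive row and column counts should work
  rows = 1000
  cols = 24
  // Hypothetical path, chosen only for this sketch
  output = "/tmp/lr-data.parquet"
  // Replace the file if it already exists instead of failing (default: errorifexists)
  save-mode = "overwrite"
  partitions = 10
}
```

With save-mode left at errorifexists, a second run against the same output path would be expected to fail rather than overwrite the earlier data. Setting it to "ignore" would leave the existing file untouched.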