Logistic Regression
Runs LogisticRegression over the input datasets. The input parameter is the path to the training dataset; the test dataset is specified by the testfile parameter.
Parameters
Name | Required (y/n) | Default | Description |
---|---|---|---|
name | yes | – | must be “lr-bml” |
input | yes | – | path to the training dataset |
testfile | yes | – | path to the test dataset |
output | no | – | path to an output file; specify it to capture the actual results of the workload |
save-mode | no | errorifexists | Options are “errorifexists”, “ignore” (no-op if exists), and “overwrite” |
numpartitions | no | 32 | number of partitions |
cacheenabled | no | false | whether or not the datasets are cached after being read from disk |
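
The defaults and required fields in the table above can be illustrated with a small sketch. This is not the tool's own code: `resolve_params` and the constants below are hypothetical names, and the logic only mirrors what the table states (three required parameters, three documented defaults, three allowed save-mode values).

```python
# Hypothetical helper for illustration only; it mirrors the parameter table
# and is not part of the benchmark tool itself.
REQUIRED = ("name", "input", "testfile")
DEFAULTS = {
    "save-mode": "errorifexists",
    "numpartitions": 32,
    "cacheenabled": False,
}
SAVE_MODES = ("errorifexists", "ignore", "overwrite")


def resolve_params(config):
    """Return a copy of config with the documented defaults filled in."""
    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    if config["name"] != "lr-bml":
        raise ValueError('name must be "lr-bml" for this workload')

    # Values given by the user take precedence over the defaults.
    resolved = {**DEFAULTS, **config}
    # output has no default; None here stands for "results are not saved".
    resolved.setdefault("output", None)

    if resolved["save-mode"] not in SAVE_MODES:
        raise ValueError("save-mode must be one of: " + ", ".join(SAVE_MODES))
    return resolved


if __name__ == "__main__":
    params = resolve_params({
        "name": "lr-bml",
        "input": "/tmp/training-data.parquet",
        "testfile": "/tmp/test-data.parquet",
    })
    print(params)
```

A configuration that sets only the three required parameters therefore behaves as if save-mode were “errorifexists”, numpartitions were 32, and cacheenabled were false.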
Examples
{
name = "lr-bml"
input = "/tmp/training-data.parquet"
testfile = "/tmp/test-data.parquet"
output = "/tmp/lr-results.csv"
}
{
name = "lr-bml"
input = "/tmp/training-data.parquet"
testfile = "/tmp/test-data.parquet"
output = "/tmp/lr-results.csv"
cacheenabled = true
}