This blog is inspired by the following Workout Wednesday challenge (dataset in the link):
2023 Week 07 | Power BI: Non-linear Regression – Workout Wednesday (workout-wednesday.com)
I cover only the linear regression part, because I had just an hour and a half to learn everything, from opening the Deneb tool to creating a layered chart and using transforms (linear regression is one type of transform).
Here is my attempt:
And some useful documentation:
Examples gallery - https://vega.github.io/vega-lite/examples/
Linear regression - https://vega.github.io/vega-lite/examples/layer_point_line_regression.html
Scatter Plot
Deneb can generate a scatter plot automatically once you drag fields into the "Visualizations" pane and open the "Edit" view inside the tool.
However, assigning a field to Series is mandatory at this stage, and I only had two columns, so this was the result:

Not quite what we need, but a decent start.
Looking at the "Specification" tab, this is the code generated for the scatter plot:
{
  "data": { "name": "dataset" },
  "mark": { "type": "point" },
  "encoding": {
    "x": {
      "field": "Power",
      "type": "quantitative"
    },
    "y": {
      "field": "Shot putt distance",
      "type": "quantitative"
    },
    "color": {
      "field": "Power",
      "type": "nominal",
      "scale": {
        "scheme": "pbiColorNominal"
      }
    }
  }
}
We want to get rid of that "color" channel, in particular its "nominal" type, which treats every Power value as a separate category. For example, substituting "nominal" with "quantitative" changes the colour into a continuous palette. However, I did not do it at this stage. When layering multiple visualizations, adding colour works differently.
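For reference, here is a minimal sketch of the generated scatter plot with the "color" channel dropped and a fixed colour set on the mark instead. The field names come from the generated spec; the "description" text and the choice of "teal" are only illustrative.

```json
{
  "description": "Sketch: scatter plot without the color channel",
  "data": { "name": "dataset" },
  "mark": { "type": "point", "filled": true, "color": "teal" },
  "encoding": {
    "x": { "field": "Power", "type": "quantitative" },
    "y": { "field": "Shot putt distance", "type": "quantitative" }
  }
}
```

Setting the colour on the mark rather than in "encoding" means every point gets the same colour and no legend is drawn.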
Layering Visualizations
Adding layers is just a matter of writing (or copy-pasting) the code for each new visualization in the "Specification" tab, as long as they are all wrapped in a "layer" array.
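The overall shape can be sketched like this: the shared "data" sits at the top level, and each visualization becomes one object inside "layer". This is a stripped-down skeleton rather than the finished chart. It also assumes the x and y encodings are hoisted to the top level so both layers share them, which Vega-Lite allows but which is not how I wrote my own spec.

```json
{
  "description": "Skeleton only: two layers sharing one dataset and one encoding",
  "data": { "name": "dataset" },
  "encoding": {
    "x": { "field": "Power", "type": "quantitative" },
    "y": { "field": "Shot putt distance", "type": "quantitative" }
  },
  "layer": [
    { "mark": { "type": "point" } },
    {
      "mark": { "type": "line" },
      "transform": [
        { "regression": "Shot putt distance", "on": "Power" }
      ]
    }
  ]
}
```

Layers are drawn in the order they appear, so the line ends up on top of the points.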
This is the final code for the viz:
{
  "data": {"name": "dataset"},
  "layer": [
    {
      "mark": {
        "type": "point",
        "filled": true,
        "color": "teal"
      },
      "encoding": {
        "x": {
          "field": "Power",
          "type": "quantitative"
        },
        "y": {
          "field": "Shot putt distance",
          "type": "quantitative"
        }
      }
    },
    {
      "mark": {
        "type": "line",
        "color": "grey"
      },
      "transform": [
        {
          "regression": "Shot putt distance",
          "on": "Power"
        }
      ],
      "encoding": {
        "x": {
          "field": "Power",
          "type": "quantitative"
        },
        "y": {
          "field": "Shot putt distance",
          "type": "quantitative"
        }
      }
    },
    {
      "transform": [
        {
          "regression": "Shot putt distance",
          "on": "Power",
          "params": true
        },
        {"calculate": "'R²: '+format(datum.rSquared, '.2f')", "as": "R2"}
      ],
      "mark": {
        "type": "text",
        "color": "black",
        "fontSize": 16,
        "fontWeight": "normal",
        "x": "width",
        "align": "right",
        "y": -5
      },
      "encoding": {
        "text": {"type": "nominal", "field": "R2"}
      }
    }
  ]
}
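The third layer relies on "params": true, which makes the regression transform return the fitted model (a "coef" array starting with the intercept, plus an "rSquared" value) instead of points along the trend line. As a rough sketch of what else that makes possible, the extra layer below would print the fitted equation under the R² label. The "equation" field name and the text position are my own illustrative choices and were not part of my chart.

```json
{
  "description": "Sketch: extra text layer showing the fitted line, assuming a linear fit",
  "transform": [
    { "regression": "Shot putt distance", "on": "Power", "params": true },
    {
      "calculate": "'y = ' + format(datum.coef[1], '.2f') + 'x + ' + format(datum.coef[0], '.2f')",
      "as": "equation"
    }
  ],
  "mark": { "type": "text", "align": "right", "x": "width", "y": 15 },
  "encoding": {
    "text": { "field": "equation", "type": "nominal" }
  }
}
```

To try it, add this object as a fourth entry in the "layer" array. Note that a negative intercept would be shown as "+ -…", so the string would need a little more work before publishing.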
