Self-Service Power BI with Shared Datasets
Leveraging the shared datasets feature from Power BI allows your company’s analysts to work from a single source of truth.
Data silos are something that every large organization struggles with. Even if you have a centralized data warehouse, various analyst groups could potentially query the data in different ways and end up with different answers to the same question.
With a Power BI dataset, analysts can quickly connect to a pre-defined table structure that has been modeled together and includes universal calculations. This ensures that everyone is speaking the same language and that there is little room for misinterpretation.
The initial setup of this dataset should be completed by a BI or IT team very familiar with the data source (ideally a data warehouse), that is also aware of the metrics used by the business. The BI team can start by connecting to the data source within Desktop and including all of the necessary tables.
Once imported, those tables can be appropriately modeled together. Finally, DAX measures can be created to ensure that there is only one way to calculate a metric (ex. Revenue) for the end users. Typically, after doing these steps, it would be time to create the visualizations on the report canvas – but since we’re leaving that part up to the analysts, the canvas can be left blank.
After the report/dataset has been created, publish it to a workspace where several analysts can access it. Note: This workspace has to be part of the New Workspace Experience within the Power BI Service. Anyone with access to this workspace will have the ability to connect to the newly created dataset. If you’d like to make this dataset available to the entire organization, there is the ability to make it a Promoted or Certified dataset.
The report creator will then begin a report within Desktop, but instead of connecting to a database using Import or DirectQuery, they will connect directly to the Power BI Dataset via Live Connection. They can create a report using the given fields and calculations, then publish that report to their own workspace.
What are the advantages of using this method?
- Single source of truth for data and calculations
- Ability to limit database access
- This allows users to build reports using the data warehouse, without actually having credentials to the DW
- Changes to the original dataset will propagate to every downstream report
- Need to add a new column or tweak a calculation? Next time the analyst opens the report, these changes will automatically be passed down to them
- Report creators can access a dataset from Workspace A, create a report, and deploy that report within Workspace B
- Report creators don’t need to worry about gateways and scheduled refreshes
- The initial dataset can be set up with a scheduled refresh which will propagate down to end users
Potential Limitation
In my view, there is really only one limitation to this method – but for some organizations, it’s potentially a big one. When you create a report using a Power BI dataset, you are no longer able to connect an outside source to your report. So, if the original data warehouse takes care of 95% of your needs, but you need to supplement an Excel file to the report’s dataset, you’re out of luck. Other features are disabled as well, including the ability to add calculated columns or calculated tables.
The good news about this limitation is that Microsoft is working to allow outside data sources to be incorporated with the pre-defined dataset. It’s currently part of the Wave 2 release, scheduled for the end of 2020.
Incorporating these centralized datasets allows your organization to have streamlined, sustainable, and accurate self-service BI.