23 changes: 0 additions & 23 deletions metadata-ingestion/docs/sources/looker/looker.md

This file was deleted.

62 changes: 62 additions & 0 deletions metadata-ingestion/docs/sources/looker/looker_pre.md
@@ -0,0 +1,62 @@
### Pre-Requisites

#### Set up the right permissions
You need to grant the following permissions to the ingesting user for ingestion to work correctly.
```
access_data
explore
manage_models
see_datagroups
see_lookml
see_lookml_dashboards
see_looks
see_pdts
see_queries
see_schedules
see_sql
see_system_activity
see_user_dashboards
see_users
```
Here is an example permission set after configuration.
![Looker DataHub Permission Set](./looker_datahub_permission_set.png)

#### Get an API key

You need to get an API key for the account with the above privileges to perform ingestion. See the [Looker authentication docs](https://docs.looker.com/reference/api-and-integration/api-auth#authentication_with_an_sdk) for the steps to create a client ID and secret.
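
Once created, the client ID and secret go straight into the recipe. Here is a minimal sketch of the relevant recipe fragment (the base URL is a placeholder for your own Looker instance, and the credentials are assumed to be supplied as environment variables):

```
source:
  type: looker
  config:
    base_url: https://company.cloud.looker.com  # placeholder - use your instance's URL
    client_id: ${LOOKER_CLIENT_ID}
    client_secret: ${LOOKER_CLIENT_SECRET}
```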


### Ingestion through UI

The following video shows you how to get started with ingesting Looker metadata through the UI.

:::note

You will need to run `lookml` ingestion through the CLI after you have ingested Looker metadata through the UI. Otherwise, you will not be able to see Looker Views and their lineage to your warehouse tables.

:::
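
For reference, that CLI step boils down to installing the plugin and running your recipe with the **datahub** CLI. A minimal sketch, where `lookml_recipe.yml` is a hypothetical name for your recipe file:

```
pip install 'acryl-datahub[lookml]'
datahub ingest -c lookml_recipe.yml
```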

<div
  style={{
    position: "relative",
    paddingBottom: "57.692307692307686%",
    height: 0
  }}
>
  <iframe
    src="https://www.loom.com/embed/b8b9654e02714d20a44122cc1bffc1bb"
    frameBorder={0}
    webkitallowfullscreen=""
    mozallowfullscreen=""
    allowFullScreen=""
    style={{
      position: "absolute",
      top: 0,
      left: 0,
      width: "100%",
      height: "100%"
    }}
  />
</div>


13 changes: 0 additions & 13 deletions metadata-ingestion/docs/sources/looker/lookml.md

This file was deleted.

11 changes: 11 additions & 0 deletions metadata-ingestion/docs/sources/looker/lookml_post.md
@@ -0,0 +1,11 @@
#### Configuration Notes

:::note

The integration can use a SQL parser to try to determine the tables that the views depend on.

:::

This parsing is disabled by default, but it can be enabled by setting `parse_table_names_from_sql: True`. The default parser is based on the [`sqllineage`](https://pypi.org/project/sqllineage/) package.
Because this package doesn't officially support all the SQL dialects that Looker supports, the results might not be correct. You can, however, implement a custom parser and use it by setting the `sql_parser` configuration value. A custom SQL parser must inherit from `datahub.utilities.sql_parser.SQLParser`
and must be made available to DataHub, for example by installing it as a package. The configuration value then needs to be set to the `module_name.ClassName` of the parser.
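
For example, a recipe fragment that enables parsing with a custom parser might look like the following sketch, where `my_company.parsing.CustomSQLParser` is a hypothetical class you would implement and install yourself:

```
source:
  type: lookml
  config:
    base_folder: /path/to/lookml/repo
    parse_table_names_from_sql: true
    # module_name.ClassName of a class inheriting from
    # datahub.utilities.sql_parser.SQLParser (hypothetical example)
    sql_parser: my_company.parsing.CustomSQLParser
```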
84 changes: 84 additions & 0 deletions metadata-ingestion/docs/sources/looker/lookml_pre.md
@@ -0,0 +1,84 @@
### Pre-requisites

#### [Optional] Create an API key

See the [Looker authentication docs](https://docs.looker.com/reference/api-and-integration/api-auth#authentication_with_an_sdk) for the steps to create a client ID and secret.
You need to ensure that the API key is attached to a user that has Admin privileges.

If that is not possible, read the configuration section and provide an offline specification of the `connection_to_platform_map` and the `project_name`.
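
Such an offline specification might look like the following recipe fragment, a sketch in which the folder path, project name, and connection name are placeholders; the `connection-name: platform` mapping mirrors the commented example in the GitHub Action below:

```
source:
  type: lookml
  config:
    base_folder: /path/to/lookml/repo
    project_name: my_project
    connection_to_platform_map:
      # Map each Looker connection name to the warehouse platform it points at
      my_snowflake_connection: snowflake
```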

### Ingestion through UI

Ingestion using the `lookml` connector is not supported through the UI.
However, you can set up ingestion using a GitHub Action that pushes metadata whenever your main LookML repo changes.

#### Sample GitHub Action

Drop this file into the `.github/workflows` directory of your Looker GitHub repo.

```
name: lookml metadata upload
on:
  push:
    branches:
      - main
    paths-ignore:
      - "docs/**"
      - "**.md"
  pull_request:
    branches:
      - main
    paths-ignore:
      - "docs/**"
      - "**.md"
  release:
    types: [published, edited]
  workflow_dispatch:

jobs:
  lookml-metadata-upload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: Run LookML ingestion
        run: |
          pip install 'acryl-datahub[lookml,datahub-rest]'
          cat << EOF > lookml_ingestion.yml
          # LookML ingestion configuration
          source:
            type: "lookml"
            config:
              base_folder: ${{ github.workspace }}
              parse_table_names_from_sql: true
              github_info:
                repo: ${{ github.repository }}
                branch: ${{ github.ref }}
              # Options
              #connection_to_platform_map:
              #  acryl-snow: snowflake
              #platform: snowflake
              #default_db: DEMO_PIPELINE
              api:
                client_id: ${LOOKER_CLIENT_ID}
                client_secret: ${LOOKER_CLIENT_SECRET}
                base_url: ${LOOKER_BASE_URL}
          sink:
            type: datahub-rest
            config:
              server: ${DATAHUB_GMS_HOST}
              token: ${DATAHUB_TOKEN}
          EOF
          datahub ingest -c lookml_ingestion.yml
        env:
          DATAHUB_GMS_HOST: ${{ secrets.DATAHUB_GMS_HOST }}
          DATAHUB_TOKEN: ${{ secrets.DATAHUB_TOKEN }}
          LOOKER_BASE_URL: https://acryl.cloud.looker.com # <--- replace with your Looker base URL
          LOOKER_CLIENT_ID: ${{ secrets.LOOKER_CLIENT_ID }}
          LOOKER_CLIENT_SECRET: ${{ secrets.LOOKER_CLIENT_SECRET }}
```

If you want to ingest LookML using the **datahub** CLI directly, read on for instructions and configuration details.
5 changes: 3 additions & 2 deletions metadata-ingestion/docs/sources/looker/lookml_recipe.yml
@@ -31,6 +31,7 @@ source:

# Optional additional github information. Used to add github links on the dataset's entity page.
github_info:
  repo: org/repo-name

# Default sink is datahub-rest and doesn't need to be configured
# See https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub for customization options
29 changes: 27 additions & 2 deletions metadata-ingestion/docs/sources/snowflake/README.md
@@ -1,4 +1,29 @@
Ingesting metadata from Snowflake requires either the **snowflake-beta** module with a single recipe (recommended) or the two separate modules **snowflake** and **snowflake-usage** (soon to be deprecated), each with its own recipe.

All three modules are described on this page.

We encourage you to try out the new **snowflake-beta** plugin as an alternative to running both the **snowflake** and **snowflake-usage** plugins, and to share your feedback. `snowflake-beta` is much faster than `snowflake` for extracting metadata.
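
If you want to try it from the CLI, install the plugin first. A sketch, assuming the extra's name matches the plugin name, as it does for the other plugins:

```
pip install 'acryl-datahub[snowflake-beta]'
```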

## Snowflake Ingestion through the UI

The following video shows you how to ingest Snowflake metadata through the UI.

<div style={{ position: "relative", paddingBottom: "56.25%", height: 0 }}>
  <iframe
    src="https://www.loom.com/embed/15d0401caa1c4aa483afef1d351760db"
    frameBorder={0}
    webkitallowfullscreen=""
    mozallowfullscreen=""
    allowFullScreen=""
    style={{
      position: "absolute",
      top: 0,
      left: 0,
      width: "100%",
      height: "100%"
    }}
  />
</div>


Read on if you are interested in ingesting Snowflake metadata using the **datahub** CLI, or want to learn about all the configuration parameters supported by the connectors.
@@ -1,12 +1,11 @@
source:
  type: snowflake-beta
  config:
    # This option is recommended for the first run, to ingest all historical lineage
    ignore_start_time_lineage: true
    # This is an alternative option to specify the start_time for lineage
    # if you don't want to look back to the beginning
    start_time: "2022-03-01T00:00:00Z"

    # Coordinates
    account_id: "abc48144"
@@ -35,9 +34,7 @@ source:
    profile_table_level_only: true
    profile_pattern:
      allow:
        - "ACCOUNTING_DB.*.*"
        - "MARKETING_DB.*.*"

# Default sink is datahub-rest and doesn't need to be configured
# See https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub for customization options