Description
I am loading the model like this:
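import sparknlp
import nlu

spark = sparknlp.start()
# header=True assumes nlp_data.csv has a header row that names a "text" column
df = spark.read.csv("nlp_data.csv", header=True)
# Collect the "text" column into a plain Python list of strings
texts = df.select("text").rdd.flatMap(lambda x: x).collect()
res = nlu.load("pos").predict(texts)
print(res)
spark.stop()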
Each time I get the following messages in my console:
com.johnsnowlabs.nlp#spark-nlp_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-3f17e4b8-0bdf-40c5-9879-d62f9c2dc974;1.0
confs: [default]
found com.johnsnowlabs.nlp#spark-nlp_2.12;5.2.3 in central
found com.typesafe#config;1.4.2 in central
found org.rocksdb#rocksdbjni;6.29.5 in central
found com.amazonaws#aws-java-sdk-s3;1.12.500 in central
found com.amazonaws#aws-java-sdk-kms;1.12.500 in central
found com.amazonaws#aws-java-sdk-core;1.12.500 in central
found commons-logging#commons-logging;1.1.3 in central
found commons-codec#commons-codec;1.15 in central
found org.apache.httpcomponents#httpclient;4.5.13 in central
found org.apache.httpcomponents#httpcore;4.4.13 in central
found software.amazon.ion#ion-java;1.0.2 in central
found com.fasterxml.jackson.dataformat#jackson-dataformat-cbor;2.12.6 in central
found joda-time#joda-time;2.8.1 in central
found com.amazonaws#jmespath-java;1.12.500 in central
found com.github.universal-automata#liblevenshtein;3.0.0 in central
found com.google.protobuf#protobuf-java-util;3.0.0-beta-3 in central
found com.google.protobuf#protobuf-java;3.0.0-beta-3 in central
found com.google.code.gson#gson;2.3 in central
found it.unimi.dsi#fastutil;7.0.12 in central
found org.projectlombok#lombok;1.16.8 in central
found com.google.cloud#google-cloud-storage;2.20.1 in central
found com.google.guava#guava;31.1-jre in central
found com.google.guava#failureaccess;1.0.1 in central
found com.google.guava#listenablefuture;9999.0-empty-to-avoid-conflict-with-guava in central
found com.google.errorprone#error_prone_annotations;2.18.0 in central
found com.google.j2objc#j2objc-annotations;1.3 in central
found com.google.http-client#google-http-client;1.43.0 in central
found io.opencensus#opencensus-contrib-http-util;0.31.1 in central
found com.google.http-client#google-http-client-jackson2;1.43.0 in central
found com.google.http-client#google-http-client-gson;1.43.0 in central
found com.google.api-client#google-api-client;2.2.0 in central
found com.google.oauth-client#google-oauth-client;1.34.1 in central
found com.google.http-client#google-http-client-apache-v2;1.43.0 in central
found com.google.apis#google-api-services-storage;v1-rev20220705-2.0.0 in central
found com.google.code.gson#gson;2.10.1 in central
found com.google.cloud#google-cloud-core;2.12.0 in central
found io.grpc#grpc-context;1.53.0 in central
found com.google.auto.value#auto-value-annotations;1.10.1 in central
found com.google.auto.value#auto-value;1.10.1 in central
found javax.annotation#javax.annotation-api;1.3.2 in central
found com.google.cloud#google-cloud-core-http;2.12.0 in central
found com.google.http-client#google-http-client-appengine;1.43.0 in central
found com.google.api#gax-httpjson;0.108.2 in central
found com.google.cloud#google-cloud-core-grpc;2.12.0 in central
found io.grpc#grpc-alts;1.53.0 in central
found io.grpc#grpc-grpclb;1.53.0 in central
found org.conscrypt#conscrypt-openjdk-uber;2.5.2 in central
found io.grpc#grpc-auth;1.53.0 in central
found io.grpc#grpc-protobuf;1.53.0 in central
found io.grpc#grpc-protobuf-lite;1.53.0 in central
found io.grpc#grpc-core;1.53.0 in central
found com.google.api#gax;2.23.2 in central
found com.google.api#gax-grpc;2.23.2 in central
found com.google.auth#google-auth-library-credentials;1.16.0 in central
found com.google.auth#google-auth-library-oauth2-http;1.16.0 in central
found com.google.api#api-common;2.6.2 in central
found io.opencensus#opencensus-api;0.31.1 in central
found com.google.api.grpc#proto-google-iam-v1;1.9.2 in central
found com.google.protobuf#protobuf-java;3.21.12 in central
found com.google.protobuf#protobuf-java-util;3.21.12 in central
found com.google.api.grpc#proto-google-common-protos;2.14.2 in central
found org.threeten#threetenbp;1.6.5 in central
found com.google.api.grpc#proto-google-cloud-storage-v2;2.20.1-alpha in central
found com.google.api.grpc#grpc-google-cloud-storage-v2;2.20.1-alpha in central
found com.google.api.grpc#gapic-google-cloud-storage-v2;2.20.1-alpha in central
found com.fasterxml.jackson.core#jackson-core;2.14.2 in central
found com.google.code.findbugs#jsr305;3.0.2 in central
found io.grpc#grpc-api;1.53.0 in central
found io.grpc#grpc-stub;1.53.0 in central
found org.checkerframework#checker-qual;3.31.0 in central
found io.perfmark#perfmark-api;0.26.0 in central
found com.google.android#annotations;4.1.1.4 in central
found org.codehaus.mojo#animal-sniffer-annotations;1.22 in central
found io.opencensus#opencensus-proto;0.2.0 in central
found io.grpc#grpc-services;1.53.0 in central
found com.google.re2j#re2j;1.6 in central
found io.grpc#grpc-netty-shaded;1.53.0 in central
found io.grpc#grpc-googleapis;1.53.0 in central
found io.grpc#grpc-xds;1.53.0 in central
found com.navigamez#greex;1.0 in central
found dk.brics.automaton#automaton;1.11-8 in central
found com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.4.4 in central
found com.microsoft.onnxruntime#onnxruntime;1.16.3 in central
:: resolution report :: resolve 1966ms :: artifacts dl 54ms
:: modules in use:
com.amazonaws#aws-java-sdk-core;1.12.500 from central in [default]
com.amazonaws#aws-java-sdk-kms;1.12.500 from central in [default]
com.amazonaws#aws-java-sdk-s3;1.12.500 from central in [default]
com.amazonaws#jmespath-java;1.12.500 from central in [default]
com.fasterxml.jackson.core#jackson-core;2.14.2 from central in [default]
com.fasterxml.jackson.dataformat#jackson-dataformat-cbor;2.12.6 from central in [default]
com.github.universal-automata#liblevenshtein;3.0.0 from central in [default]
com.google.android#annotations;4.1.1.4 from central in [default]
com.google.api#api-common;2.6.2 from central in [default]
com.google.api#gax;2.23.2 from central in [default]
com.google.api#gax-grpc;2.23.2 from central in [default]
com.google.api#gax-httpjson;0.108.2 from central in [default]
com.google.api-client#google-api-client;2.2.0 from central in [default]
com.google.api.grpc#gapic-google-cloud-storage-v2;2.20.1-alpha from central in [default]
com.google.api.grpc#grpc-google-cloud-storage-v2;2.20.1-alpha from central in [default]
com.google.api.grpc#proto-google-cloud-storage-v2;2.20.1-alpha from central in [default]
com.google.api.grpc#proto-google-common-protos;2.14.2 from central in [default]
com.google.api.grpc#proto-google-iam-v1;1.9.2 from central in [default]
com.google.apis#google-api-services-storage;v1-rev20220705-2.0.0 from central in [default]
com.google.auth#google-auth-library-credentials;1.16.0 from central in [default]
com.google.auth#google-auth-library-oauth2-http;1.16.0 from central in [default]
com.google.auto.value#auto-value;1.10.1 from central in [default]
com.google.auto.value#auto-value-annotations;1.10.1 from central in [default]
com.google.cloud#google-cloud-core;2.12.0 from central in [default]
com.google.cloud#google-cloud-core-grpc;2.12.0 from central in [default]
com.google.cloud#google-cloud-core-http;2.12.0 from central in [default]
com.google.cloud#google-cloud-storage;2.20.1 from central in [default]
com.google.code.findbugs#jsr305;3.0.2 from central in [default]
com.google.code.gson#gson;2.10.1 from central in [default]
com.google.errorprone#error_prone_annotations;2.18.0 from central in [default]
com.google.guava#failureaccess;1.0.1 from central in [default]
com.google.guava#guava;31.1-jre from central in [default]
com.google.guava#listenablefuture;9999.0-empty-to-avoid-conflict-with-guava from central in [default]
com.google.http-client#google-http-client;1.43.0 from central in [default]
com.google.http-client#google-http-client-apache-v2;1.43.0 from central in [default]
com.google.http-client#google-http-client-appengine;1.43.0 from central in [default]
com.google.http-client#google-http-client-gson;1.43.0 from central in [default]
com.google.http-client#google-http-client-jackson2;1.43.0 from central in [default]
com.google.j2objc#j2objc-annotations;1.3 from central in [default]
com.google.oauth-client#google-oauth-client;1.34.1 from central in [default]
com.google.protobuf#protobuf-java;3.21.12 from central in [default]
com.google.protobuf#protobuf-java-util;3.21.12 from central in [default]
com.google.re2j#re2j;1.6 from central in [default]
com.johnsnowlabs.nlp#spark-nlp_2.12;5.2.3 from central in [default]
com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.4.4 from central in [default]
com.microsoft.onnxruntime#onnxruntime;1.16.3 from central in [default]
com.navigamez#greex;1.0 from central in [default]
com.typesafe#config;1.4.2 from central in [default]
commons-codec#commons-codec;1.15 from central in [default]
commons-logging#commons-logging;1.1.3 from central in [default]
dk.brics.automaton#automaton;1.11-8 from central in [default]
io.grpc#grpc-alts;1.53.0 from central in [default]
io.grpc#grpc-api;1.53.0 from central in [default]
io.grpc#grpc-auth;1.53.0 from central in [default]
io.grpc#grpc-context;1.53.0 from central in [default]
io.grpc#grpc-core;1.53.0 from central in [default]
io.grpc#grpc-googleapis;1.53.0 from central in [default]
io.grpc#grpc-grpclb;1.53.0 from central in [default]
io.grpc#grpc-netty-shaded;1.53.0 from central in [default]
io.grpc#grpc-protobuf;1.53.0 from central in [default]
io.grpc#grpc-protobuf-lite;1.53.0 from central in [default]
io.grpc#grpc-services;1.53.0 from central in [default]
io.grpc#grpc-stub;1.53.0 from central in [default]
io.grpc#grpc-xds;1.53.0 from central in [default]
io.opencensus#opencensus-api;0.31.1 from central in [default]
io.opencensus#opencensus-contrib-http-util;0.31.1 from central in [default]
io.opencensus#opencensus-proto;0.2.0 from central in [default]
io.perfmark#perfmark-api;0.26.0 from central in [default]
it.unimi.dsi#fastutil;7.0.12 from central in [default]
javax.annotation#javax.annotation-api;1.3.2 from central in [default]
joda-time#joda-time;2.8.1 from central in [default]
org.apache.httpcomponents#httpclient;4.5.13 from central in [default]
org.apache.httpcomponents#httpcore;4.4.13 from central in [default]
org.checkerframework#checker-qual;3.31.0 from central in [default]
org.codehaus.mojo#animal-sniffer-annotations;1.22 from central in [default]
org.conscrypt#conscrypt-openjdk-uber;2.5.2 from central in [default]
org.projectlombok#lombok;1.16.8 from central in [default]
org.rocksdb#rocksdbjni;6.29.5 from central in [default]
org.threeten#threetenbp;1.6.5 from central in [default]
software.amazon.ion#ion-java;1.0.2 from central in [default]
:: evicted modules:
commons-logging#commons-logging;1.2 by [commons-logging#commons-logging;1.1.3] in [default]
commons-codec#commons-codec;1.11 by [commons-codec#commons-codec;1.15] in [default]
com.google.protobuf#protobuf-java-util;3.0.0-beta-3 by [com.google.protobuf#protobuf-java-util;3.21.12] in [default]
com.google.protobuf#protobuf-java;3.0.0-beta-3 by [com.google.protobuf#protobuf-java;3.21.12] in [default]
com.google.code.gson#gson;2.3 by [com.google.code.gson#gson;2.10.1] in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 85 | 0 | 0 | 5 || 80 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-3f17e4b8-0bdf-40c5-9879-d62f9c2dc974
confs: [default]
0 artifacts copied, 80 already retrieved (0kB/27ms)
pos_anc download started this may take some time.
Approximate size to download 3.9 MB
[ / ]pos_anc download started this may take some time.
Approximate size to download 3.9 MB
[ — ]Download done! Loading the resource.
[OK!]
sentence_detector_dl download started this may take some time.
Approximate size to download 354.6 KB
[ | ]sentence_detector_dl download started this may take some time.
Approximate size to download 354.6 KB
[ / ]Download done! Loading the resource.
[ — ]2024-02-06 14:43:45.340048: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[OK!]
Does this indicate that I am downloading the model(s) from the internet again and again, or are they being loaded from the jar files?
I assume that the jar files are now on my local system, since it took some time when I first installed spark-nlp, and now the jar information is printed almost immediately when I run the code.
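One way to check this directly is to look at what is already on disk. The log above reports "0 artifacts copied, 80 already retrieved" and a "dwnlded" count of 0, which suggests the jars were resolved from a local cache rather than fetched again. The sketch below lists the contents of two folders. It assumes the default locations, `~/cache_pretrained` for Spark NLP pretrained models and `~/.ivy2/jars` for the resolved jars; both can be changed through configuration, so they may differ on a given machine. `list_cached` is a hypothetical helper written for this illustration, not part of sparknlp or nlu.

```python
from pathlib import Path


def list_cached(directory):
    """Return sorted (name, size_in_bytes) pairs for entries directly under `directory`."""
    root = Path(directory).expanduser()
    if not root.is_dir():
        return []  # nothing cached here, or the cache lives somewhere else
    entries = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            # A cached model is usually a folder, so sum the files inside it
            size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
        elif entry.is_file():
            size = entry.stat().st_size
        else:
            continue  # skip broken symlinks and other special entries
        entries.append((entry.name, size))
    return entries


# Assumed default locations; adjust them if your setup overrides the defaults.
for label, path in [("pretrained models", "~/cache_pretrained"),
                    ("Ivy jars", "~/.ivy2/jars")]:
    print(f"{label}: {path}")
    for name, size in list_cached(path):
        print(f"  {name}  {size / 1e6:.1f} MB")
```

If entries for `pos_anc` and `sentence_detector_dl` show up under the model folder and their modification times do not change between runs, the "download started" lines are probably printed even when the files are served from the local cache. This is an inference from the folder contents, not confirmed behaviour.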