5 Packages


Fito Coulter

Aug 3, 2024, 5:57:37 PM
to fastcangastco

Software engineers frequently modularize code into libraries. These libraries help programmers operate with leverage: they can spend more time focusing on their unique business logic, and less time implementing code that someone else has already spent the time perfecting.

dbt packages are in fact standalone dbt projects, with models and macros that tackle a specific problem area. When you add a package to your project, the package's models and macros become part of your own project.

If your dbt project doesn't require Jinja within the package specifications, you can simply rename your existing packages.yml to dependencies.yml. However, if your project's package specifications do use Jinja, for example to add an environment variable or a Git token in a private Git package specification, you should continue using the packages.yml file name.

Currently, to use private git repositories in dbt, you need to use a workaround that involves embedding a git token with Jinja. This is not ideal as it requires extra steps like creating a user and sharing a git token. We're planning to introduce a simpler method soon that won't require Jinja-embedded secret environment variables. For that reason, dependencies.yml does not support Jinja.

dbt Labs hosts the Package hub, a registry for dbt packages, as a courtesy to the dbt Community, but does not certify or confirm the integrity, operability, effectiveness, or security of any Packages. Please read the dbt Labs Package Disclaimer before installing Hub packages.

Some package maintainers may wish to push prerelease versions of packages to the dbt Hub, in order to test out new functionality or compatibility with a new version of dbt. A prerelease version is demarcated by a suffix, such as a1 (first alpha), b2 (second beta), or rc3 (third release candidate).
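As a rough illustration of the suffix convention, the Python sketch below splits a version string into its base version and prerelease stage. The helper name `describe_prerelease`, the optional hyphen before the suffix, and the three-part base version are assumptions made for this example; it is not how dbt itself parses versions.

```python
import re

# Assumed shape: major.minor.patch, an optional hyphen, then a1 / b2 / rc3 style suffix.
_PRERELEASE = re.compile(r"^(?P<base>\d+\.\d+\.\d+)-?(?P<kind>a|b|rc)(?P<num>\d+)$")
_KINDS = {"a": "alpha", "b": "beta", "rc": "release candidate"}


def describe_prerelease(version: str):
    """Return (base, stage, number) for a prerelease string, or None otherwise."""
    match = _PRERELEASE.match(version)
    if match is None:
        return None
    return (match["base"], _KINDS[match["kind"]], int(match["num"]))


print(describe_prerelease("1.0.0-a1"))
print(describe_prerelease("1.0.0"))
```

A plain release such as `1.0.0` has no suffix, so the helper returns `None` for it.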

Some organizations have security requirements to pull resources only from internal services. To address the need to install packages from hosted environments such as Artifactory or cloud storage buckets, dbt Core enables you to install packages from internally-hosted tarball URLs.

This method allows the user to clone via HTTPS by passing in a git token via an environment variable. Be careful of the expiration date of any token you use, as an expired token could cause a scheduled run to fail. Additionally, user tokens can create a challenge if the user ever loses access to a specific repo.

If you are using dbt Cloud, you must adhere to the naming conventions for environment variables. Environment variables in dbt Cloud must be prefixed with either DBT_ or . Environment variable keys are uppercase and case sensitive. When referencing env_var('DBT_KEY') in your project's code, the key must exactly match the variable defined in dbt Cloud's UI.

A "local" package is a dbt project accessible from your local file system. You can install it by specifying the project's path. It works best when you nest the project within a subdirectory relative to your current project's directory.

Other patterns may work in some cases, but not always. For example, if you install this project as a package elsewhere, or try running it on a different system, the relative and absolute paths may not yield the same results.

When you update a version or revision in your packages.yml file, it isn't automatically updated in your dbt project. You should run dbt deps to update the package. You may also need to run a full refresh of the models in this package.

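When you remove a package from your packages.yml file, it isn't automatically deleted from your dbt project, as it still exists in your dbt_packages/ directory. If you want to completely uninstall a package, you should either delete the package directory in dbt_packages/, or run dbt clean to delete all packages (and any compiled models), followed by dbt deps.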

In dbt v0.17.0 only, if the package version you want is only specified as major.minor, as opposed to major.minor.patch, you may get an error that 1.0 is not of type 'string'. In that case you will have to tell dbt that your version number is a string. This issue was resolved in v0.17.1 and all subsequent versions.

The MyBinder.org service is actually a modified JupyterHub. Perhaps going through the steps of installing R and ggplot2 to work in Jupyter notebooks there would help you better understand what is involved, since you say you are rather new to coding. In sessions launched from here, you can see that ggplot2 works in a Jupyter notebook backed by an R kernel. (When the session opens, choose the R kernel to open a notebook and run library("ggplot2") to confirm it loads without an error.) You could then try to get ggplot2 working from your own repo or gist, built more simply, as a learning exercise, with the hope that you can apply some of what you learn to wherever you are struggling at present.

Are you attempting to install these packages from a notebook?
If so, would you mind going to the terminal, entering R, and then attempting to install the packages one by one? If any error comes up, paste it here.

To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a module can be imported into other modules or into the main module (the collection of variables that you have access to in a script executed at the top level and in calculator mode).

This does not add the names of the functions defined in fibo directly to the current namespace (see Python Scopes and Namespaces for more details); it only adds the module name fibo there. Using the module name you can access the functions:
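A minimal sketch of that access pattern follows. The contents of fibo are not shown in this post, so the example writes an assumed stand-in `fibo.py` with a single `fib2` function into a temporary directory before importing it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed stand-in for the fibo module the text refers to.
FIBO_SOURCE = '''
def fib2(n):
    """Return the Fibonacci numbers below n as a list."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b
    return result
'''

module_dir = tempfile.mkdtemp()
Path(module_dir, "fibo.py").write_text(FIBO_SOURCE)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import fibo

numbers = fibo.fib2(100)  # functions are reached through the module name
print(numbers)

try:
    fib2(100)  # the bare function name was never added to this namespace
except NameError:
    print("fib2 is not defined here; use fibo.fib2")
```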

A module can contain executable statements as well as function definitions. These statements are intended to initialize the module. They are executed only the first time the module name is encountered in an import statement. (They are also run if the file is executed as a script.)
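The run-once behavior can be observed with a small experiment. This sketch writes a hypothetical module named `greeter_demo` whose only statement is a `print`, imports it twice, and counts how often the statement ran.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical module whose body consists of a single executable statement.
module_dir = tempfile.mkdtemp()
Path(module_dir, "greeter_demo.py").write_text('print("initializing greeter_demo")\n')
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import greeter_demo  # first import: the module body runs
    import greeter_demo  # second import: the module is already loaded, nothing runs

init_count = captured.getvalue().count("initializing")
print(init_count)  # 1
```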

This imports all names except those beginning with an underscore (_). In most cases Python programmers do not use this facility since it introduces an unknown set of names into the interpreter, possibly hiding some things you have already defined.
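To see which names a star import brings in, the sketch below runs one inside a fresh namespace dictionary and lists the result. The module `star_demo` and its two variables are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module with one public and one underscore-prefixed name.
module_dir = tempfile.mkdtemp()
Path(module_dir, "star_demo.py").write_text(
    "public_value = 1\n"
    "_private_value = 2\n"
)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

# Run the star import in its own namespace so the imported names are easy to list.
namespace = {}
exec("from star_demo import *", namespace)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['public_value']
```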

Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code. However, it is okay to use it to save typing in interactive sessions.

On file systems which support symlinks, the directory containing the input script is calculated after the symlink is followed. In other words the directory containing the symlink is not added to the module search path.

After initialization, Python programs can modify sys.path. The directory containing the script being run is placed at the beginning of the search path, ahead of the standard library path. This means that scripts in that directory will be loaded instead of modules of the same name in the library directory. This is an error unless the replacement is intended. See section Standard Modules for more information.

To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file; it generally contains the Python version number. For example, in CPython release 3.3 the compiled version of spam.py would be cached as __pycache__/spam.cpython-33.pyc. This naming convention allows compiled modules from different releases and different versions of Python to coexist.
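You can ask the running interpreter which cache file name it would use. This short sketch relies on `importlib.util.cache_from_source` and `sys.implementation.cache_tag` from the standard library; the printed values depend on the interpreter you run it with, so no fixed output is shown.

```python
import importlib.util
import os
import sys

# The "version" part of the file name, for example "cpython-33" on CPython 3.3.
cache_tag = sys.implementation.cache_tag

# Path where the compiled form of spam.py would be cached (no optimization suffix).
cached_path = importlib.util.cache_from_source("spam.py", optimization="")

print(cache_tag)
print(cached_path)
print(os.path.basename(cached_path) == f"spam.{cache_tag}.pyc")
```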

The __init__.py files are required to make Python treat directories containing the file as packages (unless using a namespace package, a relatively advanced feature). This prevents directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.

Note that when using from package import item, the item can be either a submodule (or subpackage) of the package, or some other name defined in the package, like a function, class or variable. The import statement first tests whether the item is defined in the package; if not, it assumes it is a module and attempts to load it. If it fails to find it, an ImportError exception is raised.
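The three outcomes can be tried against the standard library's json package, used here only as a convenient stand-in. The name `no_such_item` is deliberately made up so that the import fails.

```python
import types

from json import dumps    # a function defined in the package
from json import decoder  # a submodule of the package

try:
    from json import no_such_item  # neither a defined name nor a submodule
except ImportError as exc:
    error_message = str(exc)

print(callable(dumps))                        # True
print(isinstance(decoder, types.ModuleType))  # True
print(error_message)
```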

Be aware that submodules might become shadowed by locally defined names. For example, if you added a reverse function to the sound/effects/__init__.py file, the from sound.effects import * would only import the two submodules echo and surround, but not the reverse submodule, because it is shadowed by the locally defined reverse function:
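The shadowing can be reproduced with a throwaway copy of the package. The sketch below builds an assumed minimal `sound/effects` layout in a temporary directory, with an `__all__` list and a `reverse` function in `__init__.py`, and then inspects what the star import delivers.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Assumed minimal layout: sound/effects/{__init__,echo,surround,reverse}.py
root = Path(tempfile.mkdtemp())
effects = root / "sound" / "effects"
effects.mkdir(parents=True)
(root / "sound" / "__init__.py").write_text("")
(effects / "__init__.py").write_text(
    '__all__ = ["echo", "surround", "reverse"]\n'
    "\n"
    "def reverse(msg):  # this name shadows the reverse.py submodule\n"
    "    return msg[::-1]\n"
)
for name in ("echo", "surround", "reverse"):
    (effects / f"{name}.py").write_text("")

# Drop any previously imported 'sound' package so the demo package is used.
for name in [m for m in sys.modules if m == "sound" or m.startswith("sound.")]:
    del sys.modules[name]
sys.path.insert(0, str(root))
importlib.invalidate_caches()

namespace = {}
exec("from sound.effects import *", namespace)

print(isinstance(namespace["echo"], types.ModuleType))       # True
print(isinstance(namespace["reverse"], types.FunctionType))  # True
print(namespace["reverse"]("abc"))                           # cba
```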

If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements. Consider this code:
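A runnable sketch of this situation: the three import statements are executed against an assumed throwaway `sound/effects` package with an empty `__init__.py` (so no `__all__`) and three empty submodules, and the resulting namespace is inspected.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed minimal layout with empty __init__.py files, so __all__ is not defined.
root = Path(tempfile.mkdtemp())
effects = root / "sound" / "effects"
effects.mkdir(parents=True)
(root / "sound" / "__init__.py").write_text("")
(effects / "__init__.py").write_text("")
for name in ("echo", "surround", "reverse"):
    (effects / f"{name}.py").write_text("")

# Drop any previously imported 'sound' package so the demo package is used.
for name in [m for m in sys.modules if m == "sound" or m.startswith("sound.")]:
    del sys.modules[name]
sys.path.insert(0, str(root))
importlib.invalidate_caches()

namespace = {}
exec(
    "import sound.effects.echo\n"
    "import sound.effects.surround\n"
    "from sound.effects import *\n",
    namespace,
)

found = sorted(n for n in ("echo", "surround", "reverse") if n in namespace)
print(found)  # ['echo', 'surround']
```

The reverse submodule exists on disk but was never explicitly loaded, so the star import does not bring it in.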

In this example, the echo and surround modules are imported in the current namespace because they are defined in the sound.effects package when the from...import statement is executed. (This also works when __all__ is defined.)

Remember, there is nothing wrong with using from package import specific_submodule! In fact, this is the recommended notation unless the importing module needs to use submodules with the same name from different packages.

When packages are structured into subpackages (as with the sound package in the example), you can use absolute imports to refer to submodules of sibling packages. For example, if the module sound.filters.vocoder needs to use the echo module in the sound.effects package, it can use from sound.effects import echo.

You can also write relative imports, with the from module import name form of import statement. These imports use leading dots to indicate the current and parent packages involved in the relative import. From the surround module for example, you might use:
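The sketch below shows relative imports of that kind inside a runnable example. It builds an assumed throwaway `sound` package in a temporary directory (the `formats` and `filters.equalizer` names are placeholders for sibling subpackages), puts the relative imports in `surround.py`, and checks what each one resolved to.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed package layout; every file except surround.py is empty.
files = {
    "sound/__init__.py": "",
    "sound/formats/__init__.py": "",
    "sound/filters/__init__.py": "",
    "sound/filters/equalizer.py": "",
    "sound/effects/__init__.py": "",
    "sound/effects/echo.py": "",
    "sound/effects/surround.py": (
        "from . import echo               # sibling module in sound.effects\n"
        "from .. import formats           # subpackage of the parent package\n"
        "from ..filters import equalizer  # module in a sibling subpackage\n"
    ),
}
root = Path(tempfile.mkdtemp())
for relative_path, source in files.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

# Drop any previously imported 'sound' package so the demo package is used.
for name in [m for m in sys.modules if m == "sound" or m.startswith("sound.")]:
    del sys.modules[name]
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import sound.effects.surround as surround

print(surround.echo.__name__)       # sound.effects.echo
print(surround.formats.__name__)    # sound.formats
print(surround.equalizer.__name__)  # sound.filters.equalizer
```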
