dataclasses

dataclasses is a module introduced in Python 3.7 (available as a backport for older Python versions) that provides a convenient way to define classes for storing data. It aims to reduce boilerplate code by automatically generating common methods and simplifying the process of creating simple, lightweight data containers.

Here are some key features and concepts related to dataclasses:

  1. Decorators: dataclasses utilizes a decorator-based syntax to define data classes. By applying the @dataclass decorator to a class, you can enable the automatic generation of various methods.

  2. Automatic Method Generation: When a class is decorated with @dataclass, the following methods are automatically generated:

    • __init__(): Generates an initializer method with parameters corresponding to the class attributes.
    • __repr__(): Generates a string representation of the class instance, useful for debugging.
    • __eq__(): Generates an equality comparison method to compare two instances for equality.
    • __lt__(), __le__(), __gt__(), __ge__(): Generated only when order=True is passed to the decorator; they compare instances as tuples of their fields. (No separate __ne__() is generated: Python derives != from the generated __eq__().)
  3. Type Hints: dataclasses supports type annotations, allowing you to specify the types of class attributes. This provides type hints for static analysis tools and improves code readability.

  4. Inheritance: dataclasses supports inheritance. A data class can inherit from another data class, and the generated methods will take into account the attributes from both the base and derived classes.

  5. Default Values: You can provide default values for attributes by assigning them directly in the class definition. These default values will be used if no value is provided during initialization.

  6. Mutable vs. Immutable: By default, dataclasses creates mutable classes where the attribute values can be modified. However, you can use the frozen=True parameter to create immutable classes, preventing modifications to attribute values after initialization.

Here’s a simple example of a data class definition using dataclasses:

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
    city: str = "Unknown"

person1 = Person("Alice", 25)
person2 = Person("Bob", 30, "New York")

print(person1)  # Person(name='Alice', age=25, city='Unknown')
print(person1 == person2)  # False

In the above example, the @dataclass decorator is applied to the Person class. The class attributes (name, age, and city) are specified with their types. The generated __init__(), __repr__(), and __eq__() methods allow easy initialization, representation, and comparison of Person instances.
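The frozen and order options described above can be combined. Here is a minimal sketch (the Point class is an illustrative name, not part of the example above):

from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True, order=True)
class Point:
    x: int
    y: int

p1 = Point(1, 2)
p2 = Point(1, 3)

print(p1 < p2)  # True: instances compare as tuples of fields, (1, 2) < (1, 3)

try:
    p1.x = 5  # frozen=True forbids assignment after initialization
except FrozenInstanceError:
    print("Point instances are immutable")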

Overall, dataclasses simplifies the creation of lightweight data containers in Python by automatically generating commonly used methods based on class attributes. It reduces the amount of repetitive code and provides a more concise and readable way to define data-oriented classes.

tabulate

tabulate is a Python library that provides an easy way to create formatted tables from tabular data. It is commonly used in command-line interfaces, reports, and other text-based outputs where presenting data in a structured and readable format is required.

Here are some key features and concepts related to tabulate:

  1. Table Formats: tabulate supports various table formats, including "plain", "simple", "grid", "fancy_grid", "pipe", "orgtbl", "jira", "presto", "psql", "rst", "mediawiki", "moinmoin", "html", "latex", and "latex_raw". Each format has a specific visual style for displaying the table.

  2. Data Input: tabulate accepts tabular data in the form of a list of lists, where each inner list represents a row in the table. Alternatively, it can also work with a list of dictionaries, where each dictionary represents a row, and the keys correspond to the column headers.

  3. Column Headers: By default, tabulate renders the table without a header row. You can supply headers explicitly via the headers parameter, pass headers="firstrow" to treat the first row of the data as headers, or pass headers="keys" when the rows are dictionaries.

  4. Alignment: Column alignment is controlled with the numalign and stralign parameters (and, in newer versions, colalign for per-column control). Valid options include "left", "right", "center", and "decimal" (for numbers). By default, strings are left-aligned and numbers are aligned on the decimal point.

  5. Sorting: tabulate has no built-in sorting option; sort the input rows yourself (for example with sorted() and a key function) before passing them to tabulate.

  6. Number Formatting: tabulate lets you control how numeric cells are rendered, e.g. floatfmt=".2f" to format floats and missingval to choose the text shown for None values. For fully custom per-cell formatting, format the values before passing them in.

  7. Return Value: tabulate returns the formatted table as a plain string, so it can be printed, logged, or embedded in larger documents. There is no built-in caption option; if you need a caption, print it as a separate line above or below the table.

Here’s a simple example of using tabulate to create a table:

from tabulate import tabulate

data = [
    ["John Doe", 25, "Engineer"],
    ["Jane Smith", 30, "Manager"],
    ["Bob Johnson", 40, "Developer"],
]

table = tabulate(data, headers=["Name", "Age", "Role"], tablefmt="pipe")

print(table)

Output:

| Name        |   Age | Role      |
|:------------|------:|:----------|
| John Doe    |    25 | Engineer  |
| Jane Smith  |    30 | Manager   |
| Bob Johnson |    40 | Developer |

In the above example, a table is created using the tabulate function. The data list contains the rows of the table, and the headers parameter specifies the column headers. The tablefmt parameter is set to "pipe" to generate a table with pipe-separated columns.
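Rows given as dictionaries work together with headers="keys"; a short sketch:

from tabulate import tabulate

records = [
    {"Name": "John Doe", "Age": 25, "Role": "Engineer"},
    {"Name": "Jane Smith", "Age": 30, "Role": "Manager"},
]

# headers="keys" uses the dictionary keys as column headers
print(tabulate(records, headers="keys", tablefmt="grid"))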

Overall, tabulate provides a simple and flexible way to format and present tabular data in Python. It offers various table styles, customization options, and supports multiple output formats, making it a versatile tool for displaying data in a visually appealing and organized manner.

Pillow

Pillow is a powerful and popular Python library for image processing and manipulation. It provides a wide range of functionalities for opening, editing, and saving different image file formats. Here are some key features and concepts related to Pillow:

  1. Image I/O: Pillow supports reading and writing various image file formats, including popular formats such as JPEG, PNG, GIF, BMP, TIFF, and more. It provides functions like Image.open() to open an image file and Image.save() to save an image to a file.

  2. Image Editing: Pillow offers a rich set of image editing capabilities. You can perform various operations on images, such as resizing, cropping, rotating, flipping, and changing the image mode. The library provides functions and methods to modify images, including Image.resize(), Image.crop(), Image.rotate(), Image.transpose(), and more.

  3. Image Filtering and Enhancement: Pillow supports a range of image filtering and enhancement techniques. You can apply filters like blur, sharpen, edge enhance, and smooth to images using functions like ImageFilter.BLUR, ImageFilter.SHARPEN, ImageFilter.EDGE_ENHANCE, and ImageFilter.SMOOTH. Additionally, you can adjust image properties like brightness, contrast, and color balance.

  4. Image Manipulation: Pillow allows you to manipulate pixel data directly. You can access and modify individual pixels or regions of an image using methods like Image.getpixel() and Image.putpixel(). This enables advanced image processing techniques and custom transformations.

  5. Image Transformation: Pillow provides functions for transforming images, including affine transformations such as scaling, shearing, and translation. The Image.transform() method applies such transformations, e.g. using the affine mode with a six-coefficient matrix.

  6. Image Analysis: Pillow includes functions for basic image analysis tasks. You can extract information about an image, such as its size, format, mode, and color histogram. These functions can be used to obtain statistics or perform image analysis tasks based on image properties.

  7. Image Drawing: Pillow allows you to draw on images using various shapes and text. After creating a drawing context with ImageDraw.Draw(image), you can draw lines, rectangles, ellipses, polygons, and text using its line(), rectangle(), ellipse(), polygon(), and text() methods.

  8. Image Conversion: Pillow provides functions for converting images between different color modes, such as RGB, grayscale, and indexed. You can use methods like Image.convert() and ImageOps.grayscale() to change the color mode of an image.

  9. Image Compositing: Pillow supports compositing multiple images together. You can overlay one image on top of another, blend images using different blending modes, and create image masks. Functions like Image.alpha_composite() and Image.blend() enable image compositing operations.

Pillow is widely used in various domains, including computer vision, web development, scientific research, and image processing applications. Its extensive capabilities and user-friendly API make it a popular choice for working with images in Python.

Here’s a simple example of using Pillow to open and resize an image:

from PIL import Image

# Open an image file
image = Image.open("image.jpg")

# Resize the image
resized_image = image.resize((800, 600))

# Save the resized image
resized_image.save("resized_image.jpg")

In the above example, the Image.open() function is used to open an image file. The resize() method is then called on the image object to resize the image to the desired width and height. Finally, the save() method is called to write the resized image to a new file.
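Building on the filtering and drawing features described above, here is a small sketch (file names are placeholders):

from PIL import Image, ImageDraw, ImageFilter

image = Image.open("image.jpg")

# Apply a blur filter to a copy of the image
blurred = image.filter(ImageFilter.BLUR)

# Draw a red rectangle and a text label on the blurred image
draw = ImageDraw.Draw(blurred)
draw.rectangle([10, 10, 200, 100], outline="red", width=3)
draw.text((15, 15), "Sample", fill="red")

blurred.save("annotated_image.jpg")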

yacs

yacs (Yet Another Configuration System) is a lightweight Python library designed to manage configurations in a flexible and hierarchical manner. It provides a simple and intuitive way to define and organize configuration options for projects or applications. yacs is widely used in the computer vision community and serves as the configuration system for frameworks such as Detectron2 and many other PyTorch-based projects.

Here are some key features and concepts related to yacs:

  1. Hierarchical Configuration: yacs organizes configurations in a hierarchical manner, allowing for easy customization and overriding of settings at different levels. Configurations are defined using nested dictionaries, where each level represents a specific component or module.

  2. Dot Access: yacs supports dot access syntax, allowing easy access to configuration values using dot notation. This provides a convenient way to access nested configuration options without having to manually traverse the dictionary structure.

  3. Default Values: yacs allows you to define default values for configuration options. If a value is not specified at a particular level, yacs falls back to the default value defined in the configuration hierarchy.

  4. Configuration Loading: yacs loads configurations from YAML. The usual pattern is to define a default configuration in code and then call merge_from_file() to overlay values from a YAML file; the CfgNode.load_cfg() helper can also build a configuration from a YAML string or an open file object.

  5. Configuration Merging: yacs supports merging configurations from multiple sources. merge_from_file() merges a YAML file into an existing configuration, merge_from_other_cfg() merges another CfgNode, and merge_from_list() merges a flat list of key-value pairs, which is handy for command-line overrides.

  6. Type Checking and Freezing: when configurations are merged, yacs checks each incoming value against the type of the corresponding default; mismatches raise an error, which helps ensure configurations adhere to the expected structure and types. In addition, freeze() makes a configuration immutable so later code cannot accidentally modify it (and defrost() re-enables mutation).

  7. Readability and Serialization: yacs provides a clean and readable representation of configurations. The dump() method serializes a configuration to a YAML string, which allows easy storage and versioning of configuration options.

Here’s a simple example demonstrating the usage of yacs:

from yacs.config import CfgNode

# Create a new configuration object
cfg = CfgNode()

# Define configuration options
cfg.MODEL = CfgNode()
cfg.MODEL.NAME = "ResNet"
cfg.MODEL.DEPTH = 50
cfg.MODEL.PRETRAINED = True

# Access configuration options
print(cfg.MODEL.NAME)  # "ResNet"
print(cfg.MODEL.DEPTH)  # 50
print(cfg.MODEL.PRETRAINED)  # True

# Load configurations from a YAML file
cfg.merge_from_file("config.yaml")

# Merge configurations from another source
other_cfg = CfgNode()
other_cfg.MODEL = CfgNode()
other_cfg.MODEL.NAME = "VGG"
cfg.merge_from_other_cfg(other_cfg)

# Access merged configuration options
print(cfg.MODEL.NAME)  # "VGG"

# Freeze the configuration to prevent accidental modification
cfg.freeze()

# Serialize the configuration to a YAML string
cfg_yaml = cfg.dump()

# Print the serialized configuration
print(cfg_yaml)

In the above example, a new CfgNode object is created to store configuration options. Configuration options are defined using nested attributes, and dot access is used to read the values. The merge_from_file() method loads configurations from a YAML file and merges them into the existing configuration (the file may only contain keys that are already defined, and the value types must match the defaults). The freeze() method then makes the configuration immutable, guarding against accidental modification. Finally, the dump() method serializes the configuration to a YAML string.

yacs provides a clean and extensible way to manage configurations, making it easier to organize and customize settings for projects and applications. It promotes a modular and hierarchical approach to configuration management, enabling flexibility and ease of use.

logging

Logging is a built-in module in Python that provides a flexible and configurable way to record messages during the execution of a program. It allows developers to capture and store information about the program’s behavior, errors, warnings, and other relevant events.

The logging module provides several components, including loggers, handlers, formatters, and filters, which can be configured to control the behavior and destination of log messages. Here’s an overview of these components:

  1. Loggers: Loggers are the entry point of the logging system. They are responsible for exposing the logging API to the program. Loggers are organized in a hierarchical structure based on their names, forming a logger hierarchy. Each logger can be assigned a logging level, and it can propagate its messages to its parent logger.

  2. Handlers: Handlers define where log messages are sent. They determine the output destination for the log records. Handlers can be configured to send log messages to the console, files, network sockets, email, or any other desired destination. Multiple handlers can be attached to a logger.

  3. Formatters: Formatters define the layout and structure of the log messages. They specify the format in which the log records are displayed. Formatters can be customized to include timestamps, log levels, module names, and other information in the log messages.

  4. Filters: Filters allow for more fine-grained control over which log records are processed and outputted. They can be used to selectively filter log messages based on their content or other criteria.

The logging module provides several logging methods to record log messages, including debug(), info(), warning(), error(), and critical(), which correspond to different log levels. Log messages can include placeholders for variable data using formatting syntax.

The logging module can be configured using various methods, including configuration files, code-based configuration, or a combination of both. Configuration options include setting the logging level, specifying log file paths, defining log formats, and more.
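A minimal sketch wiring these pieces together (the logger name and format string are arbitrary choices):

import logging

# Logger: entry point, with a threshold level
logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Handler: send log records to the console
handler = logging.StreamHandler()

# Formatter: timestamp, level, logger name, and message
formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)

logger.debug("debug details")
logger.warning("something looks off")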

Overall, the logging module in Python provides a powerful and flexible framework for logging messages, allowing developers to capture and manage log information during the execution of their programs.

logger = logging.getLogger(__name__)

The line logger = logging.getLogger(__name__) creates a logger object with the name __name__. The __name__ variable is a special attribute in Python that holds the name of the current module. By using __name__ as the name of the logger, each module can have its own logger with a unique name.

The logging.getLogger() function retrieves an existing logger with the specified name or creates a new logger if it doesn’t exist. Loggers are organized in a hierarchical structure based on their names, so the name of the logger can be used to indicate its position in the hierarchy.

By creating a logger with the name __name__, you can use this logger to record log messages specific to the current module. This allows you to easily identify the source of log messages and control their behavior independently.

Once the logger object is created, you can use its methods (e.g., logger.debug(), logger.info(), logger.warning(), etc.) to record log messages at different log levels. The logger can also be configured with handlers, formatters, and filters to control where and how the log messages are processed and displayed.
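For example, in a module named mypkg.utils (a hypothetical name), the pattern looks like this:

import logging

# When imported, __name__ is "mypkg.utils", so records from this
# module are attributed to it in the logger hierarchy
logger = logging.getLogger(__name__)

def load_data(path):
    logger.info("loading data from %s", path)  # lazy %-style formatting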

@decorator and @decorator()

In Python, decorators can be applied to functions or classes in two different ways: using @decorator and @decorator().

  1. @decorator:
    When using @decorator without parentheses, it is a shorthand syntax for applying the decorator to the function or class directly. It means that the decorator itself is called with the function or class being decorated as an argument. Here’s an example:

    def decorator(func):
        # Decorator logic
        return func

    @decorator
    def my_function():
        # Function implementation
        pass

    In this case, the decorator decorator is directly applied to my_function. The function my_function is passed as an argument to decorator, and the return value from the decorator is assigned back to my_function.

  2. @decorator():
    When using @decorator() with parentheses, it means that the decorator itself is called, and the result of that call is used as the actual decorator. This allows additional configuration or customization of the decorator. Here’s an example:

    def decorator(arg):
        def actual_decorator(func):
            # Decorator logic
            return func
        return actual_decorator

    @decorator(arg_value)
    def my_function():
        # Function implementation
        pass

    In this case, the decorator decorator is called with an argument arg_value, and the returned value is a decorator function actual_decorator. Then, actual_decorator is applied to my_function.

    This pattern is useful when the decorator needs some additional configuration or customization. The outer decorator function (decorator) can accept arguments and return an inner decorator function (actual_decorator) that will be used for the actual decoration.

So, the difference between @decorator and @decorator() lies in whether the decorator is applied directly or if there is an additional level of function call to customize the decorator behavior.
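As a concrete, runnable sketch of the parameterized pattern (repeat is an invented example, not a standard decorator):

import functools

def repeat(times):
    def actual_decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return actual_decorator

@repeat(3)
def greet(name):
    print(f"Hello, {name}!")

greet("Alice")  # prints the greeting three times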

For a real-world instance of this pattern, consider the registry example below:

class Registry:
    ...
    def register(self, obj: Any = None) -> Any:
        """
        Register the given object under the name `obj.__name__`.
        Can be used as either a decorator or not. See docstring of this class for usage.
        """
        if obj is None:
            # used as a decorator
            def deco(func_or_class: Any) -> Any:
                name = func_or_class.__name__
                self._do_register(name, func_or_class)
                return func_or_class

            return deco

        # used as a function call
        name = obj.__name__
        self._do_register(name, obj)

usage:

BACKBONE_REGISTRY = Registry('BACKBONE')
@BACKBONE_REGISTRY.register()
class MyBackbone():
    ...

Because @BACKBONE_REGISTRY.register() is written with parentheses, register() is called with no argument, so obj is None. The method therefore returns the inner deco function, which is then applied to MyBackbone as the actual decorator.

nargs=argparse.REMAINDER

nargs=argparse.REMAINDER does not require any options to be specified before it. It collects all the remaining command-line arguments into a list, regardless of whether any options were provided before it.

Here’s an example to illustrate this:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--option', type=int)
parser.add_argument('positional_args', nargs=argparse.REMAINDER)

args = parser.parse_args()

print(args.option)
print(args.positional_args)

Now, let’s consider different scenarios:

  1. Providing only positional arguments:
python script.py value1 value2 value3

Output:

None
['value1', 'value2', 'value3']

In this case, since no options were specified, the option argument is None, and all positional arguments are collected in the positional_args list.

  2. Providing an option and positional arguments:
python script.py --option 10 value1 value2 value3

Output:

10
['value1', 'value2', 'value3']

Here, the option argument is set to 10, and the remaining positional arguments are collected in the positional_args list.

So, nargs=argparse.REMAINDER allows you to collect all the remaining command-line arguments into a list, whether or not any options were specified before it. Note that once the REMAINDER argument starts matching, everything that follows (including strings that look like options) is collected into the list, so options must appear before the remainder to be parsed as options.

glob

The glob module in Python provides a way to perform pattern matching on file paths. It allows you to find files and directories that match a specified pattern using wildcards and other pattern matching expressions. Here are the key components and functions of the glob module:

  1. Wildcards:

    • * (asterisk): Matches any sequence of characters (including none).
    • ? (question mark): Matches any single character.
  2. Functions:

    • glob.glob(pathname, recursive=False): Returns a list of file paths matching the specified pattern.
      • pathname: The pattern to match against file paths. It can contain wildcards and other pattern matching expressions.
      • recursive (optional): If set to True, the function will recursively search for files in subdirectories as well.
    • glob.iglob(pathname, recursive=False): Returns an iterator over the file paths matching the specified pattern. This function is useful for handling large directories since it returns results one by one, saving memory.
  3. Pattern Matching Expressions:

    • [...]: Matches any single character within the specified set. For example, [abc] matches either 'a', 'b', or 'c'.
    • [!...]: Matches any single character not in the specified set. For example, [!abc] matches any character except 'a', 'b', or 'c'.
    • **: With recursive=True, matches any files and zero or more directories and subdirectories.

    Note that shell-style brace expansion such as file.{txt,doc} is not supported by the glob module; it is a feature of shells like bash.

Here’s an example that demonstrates the usage of glob.glob:

import glob

# Get a list of all Python files in the current directory
python_files = glob.glob("*.py")
print(python_files)

# Get a list of all text files in a subdirectory recursively
text_files = glob.glob("subdir/**/*.txt", recursive=True)
print(text_files)

In the above example, glob.glob("*.py") returns a list of all Python files in the current directory, while glob.glob("subdir/**/*.txt", recursive=True) returns a list of all text files in the "subdir" directory and its subdirectories recursively.
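glob.iglob works the same way but yields paths lazily; a short sketch (the directory name is a placeholder):

import glob

# Iterate over matches one at a time instead of building the full list in memory
for path in glob.iglob("logs/**/*.log", recursive=True):
    print(path)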

Note: The glob module follows the rules and conventions of the operating system’s filesystem. The behavior may vary depending on the platform (Windows, macOS, Linux) and the specific file patterns used.

As a concrete case, the snippet glob.glob(os.path.expanduser(args.input[0])) first expands a leading ~ in the input path or pattern to the user's home directory, then performs pattern matching on the resulting path and returns a list of matching file paths.

multiprocessing

The multiprocessing module in Python provides functionality for spawning child processes and executing code in parallel across multiple processors. It is a powerful tool for leveraging the capabilities of multi-core CPUs and distributing workloads across multiple processes.

Key concepts and components of the multiprocessing module include:

  1. Process: The Process class represents an individual child process. You can create an instance of Process and specify a target function or method to be executed in the child process. The child process runs independently from the main program.

  2. Pool: The Pool class provides a convenient way to create a pool of worker processes. You can specify the number of worker processes in the pool, and then use the map() or apply() methods to distribute tasks among the workers.

  3. Queue: The Queue class enables communication and data sharing between processes. You can create a Queue object and use its put() and get() methods to send and receive data between processes. It provides a safe and synchronized way to exchange data.

  4. Lock: The Lock class is used for synchronizing access to shared resources. It allows you to acquire and release locks, ensuring that only one process can access a shared resource at a time. This helps prevent race conditions and maintain data integrity.

  5. Manager: The Manager class provides a way to create shared objects and data structures that can be accessed by multiple processes. It offers various synchronized data types such as lists, dictionaries, and namespaces, allowing multiple processes to interact with shared data.

  6. Synchronization Primitives: The multiprocessing module provides several synchronization primitives such as Semaphore, Event, Condition, and Barrier. These primitives help coordinate the execution of processes and ensure that they wait for specific conditions before proceeding.

  7. Inter-Process Communication (IPC): multiprocessing uses various IPC mechanisms, such as pipes and queues, to enable communication between processes. This allows processes to exchange data and coordinate their activities.

  8. Shared Memory: multiprocessing supports the creation and sharing of shared memory between processes. This allows multiple processes to access and modify the same block of memory, providing a fast and efficient way to share large amounts of data.

When working with multiprocessing, it’s important to consider the following:

  • Serialization: Objects passed between processes need to be serialized and deserialized. This means that objects should be picklable, as the multiprocessing module uses the pickle module for object serialization.

  • Global Variables: Global variables defined in the main program are not shared across processes. Each process has its own separate memory space. If you need to share data between processes, you can use the synchronization and IPC mechanisms provided by multiprocessing.

  • Process Spawning: The multiprocessing module allows you to control how child processes are spawned. The available start methods include "spawn", "fork", and "forkserver". The choice of the start method depends on your platform and specific requirements.

  • Exception Handling: When working with multiple processes, it’s important to handle exceptions properly. Exceptions raised in child processes do not propagate to the parent process by default. You can use the Pool class’s apply_async() method with error handling or manually handle exceptions in child processes.

Overall, the multiprocessing module provides a high-level interface for parallel programming in Python. It simplifies the process of spawning child processes, distributing workloads, and managing communication and synchronization between processes. By leveraging multiple processors and cores, multiprocessing can significantly improve the performance and scalability of your Python programs.
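A minimal sketch using Pool to parallelize a function across worker processes (the worker function and inputs are illustrative):

import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # Distribute the inputs across a pool of 4 worker processes
    with mp.Pool(processes=4) as pool:
        results = pool.map(square, [1, 2, 3, 4, 5])
    print(results)  # [1, 4, 9, 16, 25]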

multiprocessing.set_start_method("spawn", force=True)

The multiprocessing module in Python provides support for spawning child processes. When using multiprocessing, you can set the start method to determine how the child processes are created. The set_start_method function is used to set the start method for the multiprocessing module.

In the given code snippet, mp.set_start_method("spawn", force=True) is used to set the start method to "spawn". The "spawn" method starts a fresh Python interpreter process and uses inter-process communication (IPC) to start child processes. It works on all major platforms and is the default start method on Windows and, since Python 3.8, on macOS.

The force=True argument is used to forcefully set the start method, even if the method has already been set before. This can be useful when you want to ensure that a specific start method is used consistently throughout your program.

Here’s an example demonstrating the usage of set_start_method:

import multiprocessing as mp

def worker():
    print("Worker process")

if __name__ == "__main__":
    # Set the start method to "spawn"
    mp.set_start_method("spawn", force=True)

    # Create a new process using the start method
    p = mp.Process(target=worker)
    p.start()
    p.join()

In the above example, the start method is explicitly set to "spawn" using mp.set_start_method("spawn", force=True). Then, a new process is created using mp.Process, and the worker function is executed in the child process.

Setting the start method is important when using multiprocessing, as different platforms may have different default start methods, and some start methods may have specific requirements or limitations. It’s recommended to choose the appropriate start method based on your specific needs and the platform you’re working on.

how to use pdb

pdb (Python Debugger) is a built-in debugger module in Python that allows you to interactively debug your code. Here are some basic steps to use pdb:

  1. Import the pdb module: Start by importing the pdb module at the beginning of your script.

    import pdb
  2. Set a breakpoint: To pause the execution of your code and enter the debugger, place the following line at the location where you want the breakpoint. (Since Python 3.7, calling the built-in breakpoint() has the same effect by default.)

    pdb.set_trace()
  3. Run your script: Execute your script as you normally would. When the execution reaches the pdb.set_trace() line, it will pause and enter the debugger.

  4. Interact with the debugger: Once the debugger is active, you can use various commands to inspect variables, step through code, and analyze the program state. Some commonly used commands include:

    • n or next: Execute the next line of code.
    • s or step: Step into the next function call.
    • c or continue: Continue execution until the next breakpoint or the program finishes.
    • l or list: Show the current code context.
    • p <expression>: Print the value of an expression.
    • q or quit: Quit the debugger and terminate the program.

    You can type any of these commands at the debugger’s prompt and press Enter to execute them.

  5. Continue debugging: After executing a command, the debugger will respond with the corresponding output or perform the requested action. You can continue to use commands to navigate through the code, inspect variables, and troubleshoot issues.

  6. Exit the debugger: Once you have finished debugging, you can exit the debugger by either reaching the end of the script or executing the q command to quit.

Here’s an example of how to use pdb:

import pdb

def multiply(a, b):
    result = a * b
    pdb.set_trace()  # Set a breakpoint
    return result

x = 5
y = 3
z = multiply(x, y)
print(z)

When you run this script, it will pause at the pdb.set_trace() line and enter the debugger. From there, you can use commands like n (next), p result (print the value of result), and q (quit) to debug and inspect the code.

pdb can also be used to inspect parsed command-line arguments, such as store_true flags:

import argparse
import pdb

parser = argparse.ArgumentParser()
parser.add_argument("--resume", action="store_true", help="whether to attempt to resume from the checkpoint directory")
args = parser.parse_args()

# Set a breakpoint to enter the debugger
pdb.set_trace()

# Access the 'resume' attribute to check its value
print(args.resume)

__all__

__all__ = [k for k in globals().keys() if "builtin" not in k and not k.startswith("_")]

The statement __all__ = [k for k in globals().keys() if "builtin" not in k and not k.startswith("_")] is used to define the list of public objects to be exported when using the from module import * syntax in Python.

Here’s how it works:

  • globals() returns a dictionary containing all the global variables and functions in the current module.
  • The list comprehension [k for k in globals().keys() if "builtin" not in k and not k.startswith("_")] iterates over the keys of the globals() dictionary and filters out the keys that contain the substring "builtin" or start with an underscore ("_").
  • The resulting list of keys represents the names of the public objects in the module.
  • Finally, the __all__ variable is assigned this list, indicating which objects should be exported when using the from module import * syntax.

By defining __all__, you can control the visibility of objects when importing a module. Only the objects listed in __all__ will be imported when using the from module import * syntax, and any other objects will not be accessible unless imported explicitly.
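A small sketch of the effect (the module and function names are hypothetical):

# mymodule.py
def public_helper():
    pass

def _private_helper():
    pass

# Keeps only names that don't contain "builtin" and don't start with "_"
__all__ = [k for k in globals().keys() if "builtin" not in k and not k.startswith("_")]
# __all__ == ['public_helper']

# client code:
# from mymodule import *   ->  imports only public_helper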

It’s worth noting that using the from module import * syntax is generally discouraged in Python, as it can lead to namespace pollution and make the code less readable. It’s considered a better practice to import specific objects explicitly, using the from module import object syntax, or import the module itself and access the objects using the module’s namespace, e.g., module.object.

__name__.endswith(".builtin")

datasets
   |---__init__.py
   |---builtin.py

When builtin.py is imported as part of the package (e.g. import datasets.builtin), __name__ is "datasets.builtin", so __name__.endswith(".builtin") evaluates to True. By contrast, when running python builtin.py directly in a terminal, __name__ is "__main__" and the check is False.
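This check is often used to run module-level side effects (such as dataset registration in Detectron2-style code) only when the file is imported as part of its package; a sketch of the pattern (register_all_datasets is a placeholder):

# datasets/builtin.py
def register_all_datasets():
    print("registering datasets")

# Runs on `import datasets.builtin`, but not on `python builtin.py`
if __name__.endswith(".builtin"):
    register_all_datasets()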
