Models API¶
API reference for object detection models.
Model Registry¶
The global registry for all detection models. Use this to build models by name.
- objdet.models.registry.MODEL_REGISTRY¶
Generic registry for plugin-style component management.
This class provides a centralized registry where components can be registered by name and later retrieved. It supports both decorator-style and direct registration.
- Parameters:
name – Name of this registry (for logging/error messages).
- objdet.models.registry.name¶
Registry name.
- objdet.models.registry._registry¶
Internal dictionary mapping names to registered items.
Example
>>> registry = Registry[nn.Module]("models")
>>> registry.register("my_model")(MyModelClass)
>>> model_cls = registry.get("my_model")
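The plugin-style registry described above can be sketched in plain Python. This is an illustrative re-implementation of the pattern, not the objdet source; the real `Registry` class may differ in details such as error messages and generics support.

```python
class Registry:
    """Minimal sketch of a plugin-style registry (illustrative, not the objdet source)."""

    def __init__(self, name):
        self.name = name          # registry name, used in error messages
        self._registry = {}       # maps names to registered items

    def register(self, name):
        # Decorator-style registration: @registry.register("my_model")
        def decorator(item):
            if name in self._registry:
                raise KeyError(f"{name!r} is already registered in {self.name!r}")
            self._registry[name] = item
            return item
        return decorator

    def get(self, name):
        try:
            return self._registry[name]
        except KeyError:
            raise KeyError(f"{name!r} not found in registry {self.name!r}") from None

    def build(self, name, **kwargs):
        # Look up the registered class and instantiate it with the given kwargs.
        return self.get(name)(**kwargs)


registry = Registry("models")

@registry.register("my_model")
class MyModel:
    def __init__(self, num_classes=80):
        self.num_classes = num_classes
```

Direct registration is the same call without decorator syntax: `registry.register("other_model")(OtherModelClass)`.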
Registered Models¶
| Name | Aliases | Class |
|---|---|---|
| faster_rcnn | - | FasterRCNN |
| retinanet | - | RetinaNet |
| yolov8 | - | YOLOv8 |
| yolov11 | - | YOLOv11 |
from objdet.models import MODEL_REGISTRY
# Build model from registry
model = MODEL_REGISTRY.build("faster_rcnn", num_classes=80)
Base Class¶
BaseLightningDetector¶
- class objdet.models.base.BaseLightningDetector(num_classes, class_index_mode=ClassIndexMode.TORCHVISION, learning_rate=0.001, weight_decay=0.0001, confidence_threshold=0.25, nms_threshold=0.45, pretrained=True, pretrained_backbone=True, optimizer='adamw', scheduler='cosine', scheduler_kwargs=None)[source]¶
Bases: LightningModule

Abstract base class for object detection models.

This class provides common functionality for all detection models:

- Standard training/validation/test step implementations
- Metric computation (mAP)
- Optimizer and scheduler configuration
- Logging integration

Subclasses must implement:

- forward(): Model forward pass
- _build_model(): Model architecture construction
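The subclass contract can be sketched with abc. This is an illustrative stand-in; the real base class inherits from LightningModule and supplies the training-step and metric plumbing, which is omitted here.

```python
from abc import ABC, abstractmethod


class DetectorBase(ABC):
    """Illustrative stand-in for BaseLightningDetector's abstract contract."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        # The base class drives construction; subclasses supply the architecture.
        self.model = self._build_model()

    @abstractmethod
    def _build_model(self):
        """Construct and return the underlying detection architecture."""

    @abstractmethod
    def forward(self, images, targets=None):
        """Run the model: losses during training, predictions during inference."""


class MyDetector(DetectorBase):
    def _build_model(self):
        return object()  # placeholder for a real nn.Module

    def forward(self, images, targets=None):
        if targets is not None:
            return {"loss": 0.0}
        return [{} for _ in images]
```

A subclass that omits either abstract method cannot be instantiated, which surfaces missing overrides at construction time rather than mid-training.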
- Parameters:
- num_classes (int) – Number of object classes to detect (excluding background for TorchVision models, including all classes for YOLO).
- class_index_mode (ClassIndexMode | str) – How class indices are handled. TORCHVISION expects background at index 0; YOLO has no background class.
- learning_rate (float) – Initial learning rate for the optimizer.
- weight_decay (float) – Weight decay for the optimizer.
- confidence_threshold (float) – Minimum confidence for predictions.
- nms_threshold (float) – IoU threshold for NMS.
- pretrained (bool) – Whether to use pretrained weights.
- pretrained_backbone (bool) – Whether to use the pretrained backbone only.
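What the two index modes imply for labels can be sketched in plain Python. Both helpers below are hypothetical, written only to illustrate the rule stated above (TorchVision reserves index 0 for background; YOLO does not); they are not part of objdet.

```python
def to_torchvision_labels(raw_labels):
    """Shift zero-based dataset labels up by one so index 0 is background.

    TorchVision-style heads reserve class index 0 for background, so a
    dataset class k becomes k + 1. (Hypothetical helper, for illustration.)
    """
    return [label + 1 for label in raw_labels]


def head_num_classes(num_classes, mode):
    """Output size of the classification head for a given index mode.

    TorchVision models add a background slot internally; YOLO models do not.
    (Hypothetical helper, for illustration.)
    """
    return num_classes + 1 if mode == "torchvision" else num_classes
```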
- num_classes¶
Number of detection classes.
- class_index_mode¶
Class index handling mode.
- hparams¶
Hyperparameters (auto-saved by Lightning).
Example
>>> model = MyDetector(num_classes=80, pretrained=True)
>>> trainer = L.Trainer(max_epochs=100)
>>> trainer.fit(model, datamodule)
- abstractmethod forward(images, targets=None)[source]¶
Forward pass of the model.
- Parameters:
  - images – List of input image tensors.
  - targets – Optional list of ground-truth target dicts; pass None for inference.
- Returns:
  During training (targets provided): dictionary of losses. During inference (no targets): list of prediction dicts.
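The dual return contract can be sketched as plain Python. The prediction keys "boxes"/"labels"/"scores" follow the TorchVision detection convention and are an assumption here; the loss names are placeholders.

```python
def forward(images, targets=None):
    """Illustrative dual-mode forward (not the real model)."""
    if targets is not None:
        # Training mode: return a dict of named loss values.
        return {"loss_classifier": 0.5, "loss_box_reg": 0.2}
    # Inference mode: one prediction dict per input image
    # (key names follow the TorchVision convention; assumed here).
    return [{"boxes": [], "labels": [], "scores": []} for _ in images]
```

Callers can therefore branch on whether they passed targets, rather than inspecting the return value's type.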
- configure_optimizers()[source]¶
Configure optimizer and learning rate scheduler.
- Return type:
  OptimizerLRSchedulerConfig
- Returns:
  Dictionary with optimizer and optional lr_scheduler configuration.
TorchVision Models¶
FasterRCNN¶
Two-stage detector with Region Proposal Network.
- class objdet.models.torchvision.faster_rcnn.FasterRCNN(num_classes, backbone='resnet50_fpn_v2', pretrained=False, pretrained_backbone=True, trainable_backbone_layers=3, min_size=800, max_size=1333, **kwargs)[source]¶
Bases: BaseLightningDetector

Faster R-CNN with ResNet-50 FPN backbone.

This is a two-stage object detector consisting of:

1. Region Proposal Network (RPN) for generating object proposals
2. Fast R-CNN head for classification and bounding box regression
The model uses TorchVision class indexing (background at index 0).
- Parameters:
- num_classes (int) – Number of object classes (NOT including background). The model will internally use num_classes + 1.
- backbone (str) – Backbone variant: "resnet50_fpn" or "resnet50_fpn_v2".
- pretrained (bool) – If True, use weights pretrained on COCO.
- pretrained_backbone (bool) – If True, use an ImageNet-pretrained backbone.
- trainable_backbone_layers (int) – Number of trainable backbone layers (0-5).
- min_size (int) – Minimum image size for inference.
- max_size (int) – Maximum image size for inference.
- **kwargs (Any) – Additional arguments for BaseLightningDetector.
- model¶
The underlying TorchVision Faster R-CNN model.
Example
>>> model = FasterRCNN(num_classes=20, pretrained_backbone=True)
>>> images = [torch.rand(3, 800, 600) for _ in range(4)]
>>> predictions = model(images)
from objdet.models.torchvision import FasterRCNN
from lightning import Trainer
model = FasterRCNN(
num_classes=80,
backbone="resnet50_fpn_v2",
pretrained_backbone=True,
trainable_backbone_layers=3,
)
trainer = Trainer(max_epochs=100)
trainer.fit(model, datamodule)
RetinaNet¶
One-stage detector with focal loss.
- class objdet.models.torchvision.retinanet.RetinaNet(num_classes, backbone='resnet50_fpn_v2', pretrained=False, pretrained_backbone=True, trainable_backbone_layers=3, min_size=800, max_size=1333, score_thresh=0.05, nms_thresh=0.5, detections_per_img=300, **kwargs)[source]¶
Bases: BaseLightningDetector

RetinaNet with ResNet-50 FPN backbone.

RetinaNet is a one-stage object detector that uses:

1. Feature Pyramid Network (FPN) for multi-scale features
2. Focal loss to address class imbalance
3. Separate classification and regression heads
The model uses TorchVision class indexing (background at index 0).
- Parameters:
- num_classes (int) – Number of object classes (NOT including background).
- backbone (str) – Backbone variant: "resnet50_fpn" or "resnet50_fpn_v2".
- pretrained (bool) – If True, use weights pretrained on COCO.
- pretrained_backbone (bool) – If True, use an ImageNet-pretrained backbone.
- trainable_backbone_layers (int) – Number of trainable backbone layers (0-5).
- min_size (int) – Minimum image size for inference.
- max_size (int) – Maximum image size for inference.
- score_thresh (float) – Score threshold for predictions.
- nms_thresh (float) – NMS threshold.
- detections_per_img (int) – Maximum detections per image.
- **kwargs (Any) – Additional arguments for BaseLightningDetector.
Example
>>> model = RetinaNet(num_classes=20, pretrained_backbone=True)
>>> trainer = Trainer(max_epochs=50)
>>> trainer.fit(model, datamodule)
from objdet.models.torchvision import RetinaNet
model = RetinaNet(
num_classes=80,
backbone="resnet50_fpn_v2",
pretrained_backbone=True,
score_thresh=0.05,
nms_thresh=0.5,
)
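How the three post-processing knobs interact can be sketched with a toy greedy NMS in pure Python. This is illustrative only; the real filtering is done inside TorchVision's RetinaNet, and the exact pipeline there may differ.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def postprocess(boxes, scores, score_thresh=0.05, nms_thresh=0.5,
                detections_per_img=300):
    """Toy post-processing: threshold, greedy NMS, then cap the count."""
    # 1. Drop low-confidence boxes, highest scores first.
    cand = sorted(
        (p for p in zip(boxes, scores) if p[1] >= score_thresh),
        key=lambda p: p[1], reverse=True,
    )
    # 2. Greedy NMS: keep a box only if it overlaps no kept box above nms_thresh.
    keep = []
    for box, score in cand:
        if all(iou(box, kept_box) <= nms_thresh for kept_box, _ in keep):
            keep.append((box, score))
    # 3. Cap at detections_per_img.
    return keep[:detections_per_img]
```

Lowering score_thresh admits more candidates into NMS (better recall, slower); lowering nms_thresh suppresses overlapping boxes more aggressively.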
YOLO Models¶
YOLOv8¶
- class objdet.models.yolo.yolov8.YOLOv8(num_classes, model_size='n', pretrained=True, conf_thres=0.25, iou_thres=0.45, **kwargs)[source]¶
Bases: YOLOBaseLightning

YOLOv8 object detection model wrapped for Lightning.

YOLOv8 is a state-of-the-art real-time object detector featuring:

- Anchor-free detection head
- C2f modules for efficient feature extraction
- Mosaic and MixUp augmentation (handled via transforms)
- Task-aligned assigner for positive sample selection

Available model sizes:

- n (nano): Fastest, lowest accuracy (~3.2M params)
- s (small): Fast with good accuracy (~11.2M params)
- m (medium): Balanced speed/accuracy (~25.9M params)
- l (large): High accuracy (~43.7M params)
- x (extra-large): Highest accuracy (~68.2M params)
Warning
There is a known bug in the training pipeline that causes IndexError: too many indices for tensor of dimension 2 during loss computation. This affects training via both the CLI and the Python API. Investigation is ongoing to resolve this issue in the Ultralytics loss integration.
- Parameters:
- num_classes (int) – Number of object classes (no background).
- model_size (str) – Model size variant ("n", "s", "m", "l", "x").
- pretrained (bool) – If True, load COCO pretrained weights.
- conf_thres (float) – Confidence threshold for predictions.
- iou_thres (float) – IoU threshold for NMS.
- **kwargs (Any) – Additional arguments for BaseLightningDetector.
Example
>>> # Create YOLOv8-medium model
>>> model = YOLOv8(num_classes=20, model_size="m")
>>>
>>> # Train with Lightning
>>> trainer = Trainer(
...     max_epochs=100,
...     callbacks=[ModelCheckpoint(monitor="val/mAP")],
... )
>>> trainer.fit(model, datamodule)
- MODEL_VARIANTS = {'l': 'yolov8l.pt', 'm': 'yolov8m.pt', 'n': 'yolov8n.pt', 's': 'yolov8s.pt', 'x': 'yolov8x.pt'}¶
Model Sizes:
| Size | Variant | Parameters |
|---|---|---|
| n (nano) | yolov8n.pt | ~3.2M |
| s (small) | yolov8s.pt | ~11.2M |
| m (medium) | yolov8m.pt | ~25.9M |
| l (large) | yolov8l.pt | ~43.7M |
| x (extra-large) | yolov8x.pt | ~68.2M |
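The model_size argument maps to a pretrained checkpoint through MODEL_VARIANTS, as shown above. A minimal sketch of that lookup, with validation (the `resolve_weights` helper is hypothetical; objdet may resolve weights differently):

```python
MODEL_VARIANTS = {
    "n": "yolov8n.pt", "s": "yolov8s.pt", "m": "yolov8m.pt",
    "l": "yolov8l.pt", "x": "yolov8x.pt",
}


def resolve_weights(model_size):
    """Return the pretrained checkpoint filename for a size variant.

    (Hypothetical helper, for illustration only.)
    """
    try:
        return MODEL_VARIANTS[model_size]
    except KeyError:
        raise ValueError(
            f"unknown model_size {model_size!r}; "
            f"expected one of {sorted(MODEL_VARIANTS)}"
        ) from None
```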
from objdet.models.yolo import YOLOv8
model = YOLOv8(
num_classes=80,
model_size="m",
pretrained=True,
conf_thres=0.25,
iou_thres=0.45,
)
YOLOv11¶
- class objdet.models.yolo.yolov11.YOLOv11(num_classes, model_size='n', pretrained=True, conf_thres=0.25, iou_thres=0.45, **kwargs)[source]¶
Bases: YOLOBaseLightning

YOLOv11 (YOLO11) object detection model wrapped for Lightning.

YOLOv11 is the latest iteration of the YOLO series featuring:

- Improved C3k2 blocks for better feature extraction
- Enhanced attention mechanisms
- Better small object detection
- Optimized architecture for efficiency

Available model sizes:

- n (nano): Fastest, lowest accuracy
- s (small): Fast with good accuracy
- m (medium): Balanced speed/accuracy
- l (large): High accuracy
- x (extra-large): Highest accuracy
Warning
There is a known bug in the training pipeline that may cause IndexError: too many indices for tensor of dimension 2 during loss computation. This is the same issue as YOLOv8. Investigation is ongoing to resolve this issue.
- Parameters:
- num_classes (int) – Number of object classes (no background).
- model_size (str) – Model size variant ("n", "s", "m", "l", "x").
- pretrained (bool) – If True, load COCO pretrained weights.
- conf_thres (float) – Confidence threshold for predictions.
- iou_thres (float) – IoU threshold for NMS.
- **kwargs (Any) – Additional arguments for BaseLightningDetector.
Example
>>> # Create YOLOv11-large model
>>> model = YOLOv11(num_classes=20, model_size="l")
>>>
>>> # Train with Lightning
>>> trainer = Trainer(max_epochs=100)
>>> trainer.fit(model, datamodule)
- MODEL_VARIANTS = {'l': 'yolo11l.pt', 'm': 'yolo11m.pt', 'n': 'yolo11n.pt', 's': 'yolo11s.pt', 'x': 'yolo11x.pt'}¶
Model Sizes:
| Size | Variant |
|---|---|
| n (nano) | yolo11n.pt |
| s (small) | yolo11s.pt |
| m (medium) | yolo11m.pt |
| l (large) | yolo11l.pt |
| x (extra-large) | yolo11x.pt |
from objdet.models.yolo import YOLOv11
model = YOLOv11(
num_classes=80,
model_size="l",
pretrained=True,
)
Warning
YOLOv11 has the same known training bug as YOLOv8.