New CodeProject.AI License Plate Recognition (YOLO11) Module

Any plans to release an object detection YOLO11 module? Anyway, I appreciate your hard work on License Plate Recognition (YOLOv11); it works great on my cameras.
I started working on the YOLO11 .NET module. I have been using GitHub Copilot to do most of the coding; below is the plan that GitHub Copilot came up with.

Plan: Create ObjectDetectionYOLO11Net Multi-Task Vision Module

Build a new YOLO11 module supporting Detection, Pose, Segment, and Classify tasks by adapting the YOLOv5 module structure. The module auto-detects task type from model filename suffixes (-pose, -seg, -cls), implements task-specific output parsers, and exposes separate API routes while maintaining backward compatibility with /vision/detection.

Steps:

1. Create project foundation: Copy ObjectDetectionYOLOv5Net.csproj to ObjectDetectionYOLO11Net/ObjectDetectionYOLO11Net.csproj, update <RootNamespace> to CodeProject.AI.Modules.ObjectDetection.YOLO11, <AssemblyName> to ObjectDetectionYOLO11Net, <Name> to Object Detection (YOLO11 .Net), and copy Program.cs, updating the namespace to CodeProject.AI.Modules.ObjectDetection.YOLO11 and the service to ObjectDetectionYOLO11ModuleRunner.

2. Implement YOLO11 model hierarchy in YOLOv11/Models/: Create abstract Yolo11Model.cs extending YoloModel adding TaskType enum property (Detect/Pose/Segment/Classify), then concrete models inheriting from Yolo11Model: Yolo11DetectionModel.cs (640×640, 84 dims, 80 COCO labels from yolo11*.json), Yolo11PoseModel.cs (640×640, 56 dims, 1 "person" class, 17 keypoint names), Yolo11SegmentationModel.cs (640×640, 116 dims, 80 COCO labels, 32 mask coefficients), Yolo11ClassificationModel.cs (224×224, 1000 dims, ImageNet labels from yolo11*-cls.json).
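
A minimal sketch of what this hierarchy could look like. The class names follow the plan; the stub members only stand in for the real YoloModel base class from the YOLOv5 module, and the segmentation and classification models follow the same pattern:

Code:
// Task types the module distinguishes between.
public enum YoloTaskType { Detect, Pose, Segment, Classify }

// Stand-in for "abstract Yolo11Model extends YoloModel" from the plan.
public abstract class Yolo11Model
{
    public abstract YoloTaskType TaskType   { get; }
    public abstract int          Width      { get; }  // model input width
    public abstract int          Height     { get; }  // model input height
    public abstract int          Dimensions { get; }  // per-candidate output length
}

public sealed class Yolo11DetectionModel : Yolo11Model
{
    public override YoloTaskType TaskType   => YoloTaskType.Detect;
    public override int          Width      => 640;
    public override int          Height     => 640;
    public override int          Dimensions => 84;  // 4 bbox + 80 COCO classes
}

public sealed class Yolo11PoseModel : Yolo11Model
{
    public override YoloTaskType TaskType   => YoloTaskType.Pose;
    public override int          Width      => 640;
    public override int          Height     => 640;
    public override int          Dimensions => 56;  // 4 bbox + 1 class + 17 keypoints x 3
}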

3. Build unified scorer YOLOv11/Yolo11Scorer.cs: Adapt YoloScorer.cs with constructor detecting task from filename (Contains("-pose") → Pose, Contains("-seg") → Segment, Contains("-cls") → Classify, else → Detect), reuse ResizeImage() and ExtractPixels() unchanged, implement Predict() routing to task-specific parsers: ParseDetectionOutput() (handle [1,8400,84] with no objectness—find max at indices 4-83), ParsePoseOutput() (extract bbox from 0-4, class at 5, keypoints at 6-56 as 17×3 floats, apply coordinate transform (kp-pad)/gain), ParseSegmentationOutput() (extract bbox 0-4, classes 4-83, mask coeffs 84-115, generate masks from prototype tensor, extract polygon contours), ParseClassificationOutput() (softmax top-5 from flat [1,1000] output with confidence threshold).
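
The filename-based task detection boils down to a few Contains checks; a sketch reusing the YoloTaskType enum above (the helper name is illustrative):

Code:
public static class Yolo11TaskDetector
{
    public static YoloTaskType FromPath(string modelPath)
    {
        string name = System.IO.Path.GetFileNameWithoutExtension(modelPath)
                                    .ToLowerInvariant();
        if (name.Contains("-pose")) return YoloTaskType.Pose;
        if (name.Contains("-seg"))  return YoloTaskType.Segment;
        if (name.Contains("-cls"))  return YoloTaskType.Classify;
        return YoloTaskType.Detect;  // no suffix means plain detection
    }
}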

4. Implement prediction classes in YOLOv11/: Create base Yolo11Prediction.cs with Label/Rectangle/Score matching YoloPrediction, extend with Yolo11PosePrediction adding Keypoint[] Keypoints (struct: float X, Y, Confidence) and static readonly int[][] Skeleton (19 COCO connections: nose-eyes, eyes-ears, shoulders, arms, torso, hips, legs), Yolo11SegmentationPrediction adding List<SKPoint> Contour (polygon from mask threshold), Yolo11ClassificationPrediction (no Rectangle, only List<ClassScore> with Label/Confidence).
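
A sketch of these prediction types, assuming the standard zero-based COCO keypoint ordering (nose = 0 through right ankle = 16) and the SkiaSharp types the YOLOv5 module already uses:

Code:
using System.Collections.Generic;
using SkiaSharp;

public readonly struct Keypoint
{
    public readonly float X, Y, Confidence;
    public Keypoint(float x, float y, float conf) { X = x; Y = y; Confidence = conf; }
}

public class Yolo11Prediction
{
    public string Label     { get; set; }
    public SKRect Rectangle { get; set; }
    public float  Score     { get; set; }
}

public class Yolo11PosePrediction : Yolo11Prediction
{
    public Keypoint[] Keypoints { get; set; }

    // The standard 17-keypoint COCO skeleton: 19 limb connections.
    public static readonly int[][] Skeleton =
    {
        new[]{15,13}, new[]{13,11}, new[]{16,14}, new[]{14,12}, new[]{11,12},
        new[]{5,11},  new[]{6,12},  new[]{5,6},   new[]{5,7},   new[]{6,8},
        new[]{7,9},   new[]{8,10},  new[]{1,2},   new[]{0,1},   new[]{0,2},
        new[]{1,3},   new[]{2,4},   new[]{3,5},   new[]{4,6}
    };
}

public class Yolo11SegmentationPrediction : Yolo11Prediction
{
    public List<SKPoint> Contour { get; set; }  // polygon traced from the mask
}

public readonly struct ClassScore
{
    public readonly string Label;
    public readonly float  Confidence;
    public ClassScore(string label, float conf) { Label = label; Confidence = conf; }
}

public class Yolo11ClassificationPrediction   // no Rectangle for classification
{
    public List<ClassScore> Classifications { get; set; } = new();
}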

5. Create module runner ObjectDetectionYOLO11ModuleRunner.cs: Adapt ObjectDetectionModuleRunner.cs updating namespace, add Process() handlers for commands detect (backward compatible /vision/detection route), detect-pose (/vision/pose), detect-segment (/vision/segment), classify (/vision/classify), implement GetStandardModelPath() mapping size→filename (tiny→yolo11n, small→yolo11s, medium→yolo11m, large→yolo11l, xlarge→yolo11x) with task suffix detection (-pose, -seg, -cls), create response builders BuildDetectionResponse() (existing DetectedObject array), BuildPoseResponse() (extend DetectedObject with keypoints field as float[]), BuildSegmentationResponse() (add contour field as int[]), BuildClassificationResponse() (new response with classifications array), keep existing list-custom and custom model routing with task detection from filename.
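
The size-to-filename mapping is a straightforward lookup; a sketch (the fallback case is an assumption):

Code:
public static class Yolo11ModelNames
{
    // Maps the API size parameter to a YOLO11 model filename, e.g.
    // GetStandardModelName("small", "-pose") returns "yolo11s-pose.onnx".
    public static string GetStandardModelName(string size, string taskSuffix = "")
    {
        string variant = size switch
        {
            "tiny"   => "yolo11n",
            "small"  => "yolo11s",
            "medium" => "yolo11m",
            "large"  => "yolo11l",
            "xlarge" => "yolo11x",
            _        => "yolo11m"   // assumed default for unknown sizes
        };
        return variant + taskSuffix + ".onnx";
    }
}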

6. Configure settings and build system: Copy modulesettings.json updating Name to Object Detection (YOLO11 .NET), ModuleId to ObjectDetectionYOLO11Net, MODELS_DIR/CUSTOM_MODELS_DIR paths unchanged, add route maps for /vision/pose, /vision/segment, /vision/classify with appropriate input/output descriptions (pose outputs include keypoints and skeleton, segment includes contour, classify includes classifications with top parameter default 5), copy appsettings.json/Properties/launchSettings.json updating module name, adapt build scripts (Build.bat, install.bat, package.bat) replacing yolov5 with yolo11, copy test/home-office.jpg for self-test, implement SelfTest() running detection on test image.
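
For illustration, a route-map entry for the pose endpoint might look roughly like this; the field names are recalled from the YOLOv5 module's modulesettings.json and may not match the actual schema exactly:

Code:
{
  "Name": "Pose Estimation",
  "Route": "vision/pose",
  "Method": "POST",
  "Command": "detect-pose",
  "Description": "Detects human poses, returning 17 keypoints and a skeleton per person."
}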

Critical Implementation Details:

  • Tensor dimension handling: Check output.Dimensions order before parsing; YOLO11 may output [1,84,8400] or [1,8400,84]
  • YOLO11 confidence parsing: No objectness score (buffer[4] is the first class score); iterate buffer[4] through buffer[83] to find the max class confidence and compare it directly against minConfidence (see the sketch after this list)
  • Pose keypoints: After NMS on bounding boxes, extract keypoints at indices 6-56 (17 keypoints × 3), apply same letterbox transform as bbox coordinates, filter by keypoint confidence >0.5
  • Segmentation masks: Multiply 32 coefficients (buffer[84-115]) with prototype tensor [1,32,160,160] using matrix math, apply sigmoid, threshold at 0.5, use SkiaSharp path tracing to extract contour polygon, resize/crop to bbox region
  • Classification: No bbox/NMS needed, apply softmax to logits, sort descending, take top-K (default 5), filter by minConfidence
  • Custom models: Detect task from filename pattern in custom-models/ (e.g., my-model-pose.onnx → Pose task)
  • Thread safety: Maintain lock on InferenceSession.Run(), use ConcurrentBag in parallel loops
  • GPU fallback: Wrap scorer initialization in try-catch with CPU fallback per ObjectDetector.cs lines 68-77
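
A minimal sketch of the detection parsing, keypoint un-letterboxing, and classification softmax described above, assuming the output tensor has already been flattened to a float[]; the type and helper names are illustrative, not the module's actual source:

Code:
using System;
using System.Collections.Generic;
using System.Linq;

public record Candidate(float Cx, float Cy, float W, float H, int ClassId, float Score);

public static class Yolo11Parser
{
    public static List<Candidate> ParseDetection(float[] data, int[] dims,
                                                 float minConfidence)
    {
        // YOLO11 exports may be [1,84,8400] (channels first) or [1,8400,84].
        bool channelsFirst = dims[1] == 84;
        int  numCandidates = channelsFirst ? dims[2] : dims[1];

        // Index helper that works for either layout.
        float At(int candidate, int channel) => channelsFirst
            ? data[channel * numCandidates + candidate]
            : data[candidate * 84 + channel];

        var results = new List<Candidate>();
        for (int i = 0; i < numCandidates; i++)
        {
            // No objectness score: channels 4..83 are the 80 class scores.
            int   bestClass = -1;
            float bestScore = 0f;
            for (int c = 4; c < 84; c++)
            {
                float score = At(i, c);
                if (score > bestScore) { bestScore = score; bestClass = c - 4; }
            }
            if (bestScore < minConfidence) continue;

            results.Add(new Candidate(At(i, 0), At(i, 1), At(i, 2), At(i, 3),
                                      bestClass, bestScore));
        }
        return results;   // NMS and letterbox un-transform happen afterwards
    }

    // Un-letterbox a keypoint, mirroring the bbox transform: (kp - pad) / gain.
    public static (float X, float Y) UnLetterbox(float x, float y,
                                                 float padX, float padY, float gain)
        => ((x - padX) / gain, (y - padY) / gain);

    // Numerically stable softmax over the flat [1,1000] classification logits.
    public static float[] Softmax(float[] logits)
    {
        float   max  = logits.Max();
        float[] exps = logits.Select(v => MathF.Exp(v - max)).ToArray();
        float   sum  = exps.Sum();
        return exps.Select(v => v / sum).ToArray();
    }
}

The same layout check applies before reading the extra pose and segmentation channels (keypoints, mask coefficients); the segmentation mask math is omitted here.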

Validation:

  • Self-test uses test/home-office.jpg for detection (person, laptop, keyboard expected)
  • Pose test expects person detections with 17 keypoints
  • Segment test expects object masks with polygon contours
  • Existing /vision/detection route continues working with YOLO11 detection models


 
The Object Detection (YOLO11 .NET) module is almost finished. Most likely I will release it tomorrow.


ObjectDetectionYOLO11Net Release Notes

Version 1.0.0-DirectML (Latest)


Release Date: January 24, 2025

Overview
Production-ready YOLO11 multi-task vision module with DirectML GPU acceleration for Windows. Supports object detection, pose estimation, instance segmentation, and image classification.

What's Fixed in This Release
  • ✅ DirectML GPU Acceleration Now Working: Fixed MSBuild platform detection using $([MSBuild]::IsOSPlatform('Windows')) instead of the undefined $(IsWindows) property (see the fragment after this list)
  • ✅ Preprocessor Symbol Resolution: DirectML code now properly compiled into binary (verified with AppendExecutionProvider_DML presence)
  • ✅ Build Configuration: GpuType correctly set to "DirectML" on Windows x64 platforms
  • ✅ Package Validation: DirectML.dll (18MB) included and DirectML execution provider properly initialized
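
The platform-detection fix is essentially a one-line condition change in the .csproj; an illustrative fragment (the DIRECTML symbol name is an assumption):

Code:
<!-- $(IsWindows) is not defined by default; use the MSBuild intrinsic instead. -->
<PropertyGroup Condition="$([MSBuild]::IsOSPlatform('Windows'))">
  <DefineConstants>$(DefineConstants);DIRECTML</DefineConstants>
</PropertyGroup>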

Features
  • DirectML GPU Acceleration: Hardware-accelerated inference using DirectX 12 and DirectML
  • Multi-Task Support:
    • Object Detection - Detect 80 COCO object classes
    • Pose Estimation - Detect human poses with 17 keypoints
    • Instance Segmentation - Segment objects with polygon contours
    • Image Classification - Classify images into 1000 ImageNet categories
  • YOLO11 Models: Full support for n/s/m/l/x model variants (20 models included)
  • Custom Model Support: Load your own trained YOLO11 ONNX models with automatic task detection
  • High Performance: Optimized tensor parsing with thread-safe inference and object pooling

System Requirements
  • OS: Windows 10/11 (64-bit)
  • GPU: DirectX 12 compatible GPU
  • Runtime: .NET 9.0
  • Memory: 8GB RAM minimum (16GB recommended)

API Routes
The module exposes the following endpoints:
  • /vision/detection - Object detection (80 COCO classes)
  • /vision/pose - Pose estimation (17 keypoints per person)
  • /vision/segment - Instance segmentation (polygonal contours)
  • /vision/classify - Image classification (1000 ImageNet classes)
  • /vision/custom/{model_name} - Custom model inference
  • /vision/custom/list - List available custom models

Package Contents
  • ObjectDetectionYOLO11Net.dll (main module with DirectML code compiled)
  • DirectML.dll (18MB - GPU acceleration library)
  • Microsoft.ML.OnnxRuntime.dll (ONNX Runtime with DirectML provider)
  • 20 Pre-trained YOLO11 models:
    • Detection: yolo11n.onnx, yolo11s.onnx, yolo11m.onnx, yolo11l.onnx, yolo11x.onnx
    • Pose: yolo11n-pose.onnx, yolo11s-pose.onnx, yolo11m-pose.onnx, yolo11l-pose.onnx, yolo11x-pose.onnx
    • Segmentation: yolo11n-seg.onnx, yolo11s-seg.onnx, yolo11m-seg.onnx, yolo11l-seg.onnx, yolo11x-seg.onnx
    • Classification: yolo11n-cls.onnx, yolo11s-cls.onnx, yolo11m-cls.onnx, yolo11l-cls.onnx, yolo11x-cls.onnx
  • Custom model support with automatic task detection

Technical Details
  • Build System: MSBuild uses intrinsic functions to detect the Windows platform
  • DirectML Integration: ExecutionProvider appended with ORT_SEQUENTIAL mode
  • Optimization: GraphOptimizationLevel set to ORT_ENABLE_ALL
  • Fallback Support: Automatic fallback to the CPU provider if DirectML initialization fails (see the sketch after this list)
  • Task Detection: Automatic task type detection from model filename suffixes:
    • -pose → Pose Estimation (17 keypoints)
    • -seg → Instance Segmentation (polygon contours)
    • -cls → Image Classification (224×224 input)
    • No suffix → Object Detection (80 COCO classes)
  • Thread Safety: Thread-safe inference with object pooling for optimal performance
  • Model Caching: Efficient multi-model caching for quick task switching
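
A sketch of the session setup these points describe, using the Microsoft.ML.OnnxRuntime DirectML package (the factory method is illustrative):

Code:
using Microsoft.ML.OnnxRuntime;

public static class SessionFactory
{
    public static InferenceSession Create(string modelPath)
    {
        var options = new SessionOptions
        {
            GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL,
            ExecutionMode          = ExecutionMode.ORT_SEQUENTIAL,
            EnableMemoryPattern    = false   // required by the DirectML provider
        };

        try
        {
            options.AppendExecutionProvider_DML(0);   // default DirectX 12 adapter
            return new InferenceSession(modelPath, options);
        }
        catch (OnnxRuntimeException)
        {
            // No DX12 GPU (or ARM64 build): fall back to the default CPU provider.
            var cpuOptions = new SessionOptions
            {
                GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL
            };
            return new InferenceSession(modelPath, cpuOptions);
        }
    }
}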

Known Limitations
  • DirectML only available on Windows platforms
  • Requires DirectX 12 compatible hardware
  • ARM64 builds use CPU (DirectML not available on ARM64)

Installation
  1. Extract the release package to your CodeProject.AI modules directory
  2. Ensure DirectX 12 runtime is installed (Windows 10/11 includes this)
  3. Verify GPU supports DirectX 12: Run dxdiag and check Feature Levels
  4. Module will automatically use DirectML on compatible systems

Verification
To confirm DirectML is active, check module logs for:
Code:
ObjectDetection (.NET YOLO11) setting ExecutionProvider = "DirectML"
InferenceLibrary = "DirectML"
InferenceDevice = "GPU"

API Usage Examples

Object Detection
Code:
POST /vision/detection
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.4 (optional)

Pose Estimation
Code:
POST /vision/pose
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.4 (optional)

Instance Segmentation
Code:
POST /vision/segment
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.4 (optional)

Image Classification
Code:
POST /vision/classify
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.2 (optional)
top: 5 (optional, default: 5)
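
For reference, a minimal C# client for the detection route, assuming a default local CodeProject.AI server on port 32168 and its usual /v1 route prefix:

Code:
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class DetectExample
{
    static async Task Main()
    {
        using var client = new HttpClient();
        using var form   = new MultipartFormDataContent();

        byte[] imageBytes = await File.ReadAllBytesAsync("home-office.jpg");
        form.Add(new ByteArrayContent(imageBytes), "image", "home-office.jpg");
        form.Add(new StringContent("0.4"), "min_confidence");

        HttpResponseMessage response =
            await client.PostAsync("http://localhost:32168/v1/vision/detection", form);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}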