New CodeProject.AI License Plate Recognition (YOLO11) Module

Any plans to release a YOLO11 object detection module? Anyway, I appreciate your hard work on license plate recognition (YOLO11); it works great on my cameras.
I started working on the YOLO11 .NET module. I have been using GitHub Copilot to do most of the coding; below is the plan that GitHub Copilot came up with.

Plan: Create ObjectDetectionYOLO11Net Multi-Task Vision Module

Build a new YOLO11 module supporting Detection, Pose, Segment, and Classify tasks by adapting the YOLOv5 module structure. The module auto-detects task type from model filename suffixes (-pose, -seg, -cls), implements task-specific output parsers, and exposes separate API routes while maintaining backward compatibility with /vision/detection.

Steps:

1. Create project foundation: Copy ObjectDetectionYOLOv5Net.csproj to ObjectDetectionYOLO11Net/ObjectDetectionYOLO11Net.csproj, update <RootNamespace> to CodeProject.AI.Modules.ObjectDetection.YOLO11, <AssemblyName> to ObjectDetectionYOLO11Net, <Name> to Object Detection (YOLO11 .Net), and copy Program.cs updating namespace to CodeProject.AI.Modules.ObjectDetection.YOLO11 and service to ObjectDetectionYOLO11ModuleRunner.

2. Implement YOLO11 model hierarchy in YOLOv11/Models/: Create abstract Yolo11Model.cs extending YoloModel adding TaskType enum property (Detect/Pose/Segment/Classify), then concrete models inheriting from Yolo11Model: Yolo11DetectionModel.cs (640×640, 84 dims, 80 COCO labels from yolo11*.json), Yolo11PoseModel.cs (640×640, 56 dims, 1 "person" class, 17 keypoint names), Yolo11SegmentationModel.cs (640×640, 116 dims, 80 COCO labels, 32 mask coefficients), Yolo11ClassificationModel.cs (224×224, 1000 dims, ImageNet labels from yolo11*-cls.json).

3. Build unified scorer YOLOv11/Yolo11Scorer.cs: Adapt YoloScorer.cs with constructor detecting task from filename (Contains("-pose") → Pose, Contains("-seg") → Segment, Contains("-cls") → Classify, else → Detect), reuse ResizeImage() and ExtractPixels() unchanged, implement Predict() routing to task-specific parsers: ParseDetectionOutput() (handle [1,8400,84] with no objectness; find max at indices 4-83), ParsePoseOutput() (extract bbox from indices 0-3, confidence at 4, keypoints at 5-55 as 17×3 floats, apply coordinate transform (kp-pad)/gain), ParseSegmentationOutput() (extract bbox 0-3, classes 4-83, mask coeffs 84-115, generate masks from prototype tensor, extract polygon contours), ParseClassificationOutput() (softmax top-5 from flat [1,1000] output with confidence threshold).

4. Implement prediction classes in YOLOv11/: Create base Yolo11Prediction.cs with Label/Rectangle/Score matching YoloPrediction, extend with Yolo11PosePrediction adding Keypoint[] Keypoints (struct: float X, Y, Confidence) and static readonly int[][] Skeleton (19 COCO connections: nose-eyes, eyes-ears, shoulders, arms, torso, hips, legs), Yolo11SegmentationPrediction adding List<SKPoint> Contour (polygon from mask threshold), Yolo11ClassificationPrediction (no Rectangle, only List<ClassScore> with Label/Confidence).

5. Create module runner ObjectDetectionYOLO11ModuleRunner.cs: Adapt ObjectDetectionModuleRunner.cs updating namespace, add Process() handlers for commands detect (backward compatible /vision/detection route), detect-pose (/vision/pose), detect-segment (/vision/segment), classify (/vision/classify), implement GetStandardModelPath() mapping size→filename (tiny→yolo11n, small→yolo11s, medium→yolo11m, large→yolo11l, xlarge→yolo11x) with task suffix detection (-pose, -seg, -cls), create response builders BuildDetectionResponse() (existing DetectedObject array), BuildPoseResponse() (extend DetectedObject with keypoints field as float[]), BuildSegmentationResponse() (add contour field as int[]), BuildClassificationResponse() (new response with classifications array), keep existing list-custom and custom model routing with task detection from filename.

6. Configure settings and build system: Copy modulesettings.json updating Name to Object Detection (YOLO11 .NET), ModuleId to ObjectDetectionYOLO11Net, MODELS_DIR/CUSTOM_MODELS_DIR paths unchanged, add route maps for /vision/pose, /vision/segment, /vision/classify with appropriate input/output descriptions (pose outputs include keypoints and skeleton, segment includes contour, classify includes classifications with top parameter default 5), copy appsettings.json/Properties/launchSettings.json updating module name, adapt build scripts (Build.bat, install.bat, package.bat) replacing yolov5 with yolo11, copy test/home-office.jpg for self-test, implement SelfTest() running detection on test image.
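The filename-based task detection used in steps 3 and 5 can be sketched as follows. This is an illustrative Python sketch, not the module's actual C# code; the `detect_task` helper and the string task names are assumptions standing in for the plan's TaskType enum. It uses `endswith` rather than the plan's `Contains` check, which is slightly stricter:

```python
import os

# Hypothetical mapping from filename suffix to the plan's TaskType values.
SUFFIX_TO_TASK = {"-pose": "Pose", "-seg": "Segment", "-cls": "Classify"}

def detect_task(model_path: str) -> str:
    """Infer the YOLO11 task from the model filename suffix
    (-pose / -seg / -cls), falling back to plain detection."""
    stem = os.path.splitext(os.path.basename(model_path))[0].lower()
    for suffix, task in SUFFIX_TO_TASK.items():
        if stem.endswith(suffix):
            return task
    return "Detect"
```

For example, `detect_task("custom-models/my-model-pose.onnx")` would route the custom model to the pose parser, matching the custom-model behavior described in the plan.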

Critical Implementation Details:

  • Tensor dimension handling: Check output.Dimensions order before parsing—YOLO11 may output [1,84,8400] or [1,8400,84]
  • YOLO11 confidence parsing: No objectness score (buffer[4] is first class)—iterate buffer[4] through buffer[83] to find max class confidence, compare directly against minConfidence
  • Pose keypoints: After NMS on bounding boxes, extract keypoints at indices 5-55 (17 keypoints × 3), apply same letterbox transform as bbox coordinates, filter by keypoint confidence >0.5
  • Segmentation masks: Multiply 32 coefficients (buffer[84-115]) with prototype tensor [1,32,160,160] using matrix math, apply sigmoid, threshold at 0.5, use SkiaSharp path tracing to extract contour polygon, resize/crop to bbox region
  • Classification: No bbox/NMS needed, apply softmax to logits, sort descending, take top-K (default 5), filter by minConfidence
  • Custom models: Detect task from filename pattern in custom-models/ (e.g., my-model-pose.onnx → Pose task)
  • Thread safety: Maintain lock on InferenceSession.Run(), use ConcurrentBag in parallel loops
  • GPU fallback: Wrap scorer initialization in try-catch with CPU fallback per ObjectDetector.cs lines 68-77
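The first two details above (dimension-order check and objectness-free confidence parsing) are the most error-prone part of porting from YOLOv5. A rough Python sketch, assuming the tensor shapes stated in the plan; `parse_detections` and its return shape are illustrative, not the module's C# API:

```python
import numpy as np

def parse_detections(output: np.ndarray, min_confidence: float = 0.4):
    """Sketch of ParseDetectionOutput: YOLO11 emits no objectness score,
    so the best class score at indices 4..83 is compared directly against
    min_confidence. Handles both [1,84,8400] and [1,8400,84] layouts."""
    preds = output[0]
    if preds.shape[0] == 84:          # [84, N] layout -> transpose to [N, 84]
        preds = preds.T
    boxes, scores, class_ids = [], [], []
    for row in preds:
        class_scores = row[4:84]      # 80 COCO class confidences
        class_id = int(np.argmax(class_scores))
        score = float(class_scores[class_id])
        if score >= min_confidence:
            cx, cy, w, h = row[:4]    # center-format box from the model
            boxes.append((cx - w / 2, cy - h / 2, w, h))
            scores.append(score)
            class_ids.append(class_id)
    return boxes, scores, class_ids   # NMS and letterbox un-transform follow
```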

Validation:

  • Self-test uses test/home-office.jpg for detection (person, laptop, keyboard expected)
  • Pose test expects person detections with 17 keypoints
  • Segment test expects object masks with polygon contours
  • Existing /vision/detection route continues working with YOLO11 detection models


 
The Object Detection (YOLO11 .NET) module is almost finished. Most likely I will release it tomorrow.


ObjectDetectionYOLO11Net Release Notes

Version 1.0.0-DirectML (Latest)


Release Date: January 24, 2025

Overview
Production-ready YOLO11 multi-task vision module with DirectML GPU acceleration for Windows. Supports object detection, pose estimation, instance segmentation, and image classification.

What's Fixed in This Release
  • ✅ DirectML GPU Acceleration Now Working: Fixed MSBuild platform detection using $([MSBuild]::IsOSPlatform('Windows')) instead of undefined $(IsWindows) property
  • ✅ Preprocessor Symbol Resolution: DirectML code now properly compiled into binary (verified with AppendExecutionProvider_DML presence)
  • ✅ Build Configuration: GpuType correctly set to "DirectML" on Windows x64 platforms
  • ✅ Package Validation: DirectML.dll (18MB) included and DirectML execution provider properly initialized

Features
  • DirectML GPU Acceleration: Hardware-accelerated inference using DirectX 12 and DirectML
  • Multi-Task Support:
    • Object Detection - Detect 80 COCO object classes
    • Pose Estimation - Detect human poses with 17 keypoints
    • Instance Segmentation - Segment objects with polygon contours
    • Image Classification - Classify images into 1000 ImageNet categories
  • YOLO11 Models: Full support for n/s/m/l/x model variants (20 models included)
  • Custom Model Support: Load your own trained YOLO11 ONNX models with automatic task detection
  • High Performance: Optimized tensor parsing with thread-safe inference and object pooling

System Requirements
  • OS: Windows 10/11 (64-bit)
  • GPU: DirectX 12 compatible GPU
  • Runtime: .NET 9.0
  • Memory: 8GB RAM minimum (16GB recommended)

API Routes
The module exposes the following endpoints:
  • /vision/detection - Object detection (80 COCO classes)
  • /vision/pose - Pose estimation (17 keypoints per person)
  • /vision/segment - Instance segmentation (polygonal contours)
  • /vision/classify - Image classification (1000 ImageNet classes)
  • /vision/custom/{model_name} - Custom model inference
  • /vision/custom/list - List available custom models

Package Contents
  • ObjectDetectionYOLO11Net.dll (main module with DirectML code compiled)
  • DirectML.dll (18MB - GPU acceleration library)
  • Microsoft.ML.OnnxRuntime.dll (ONNX Runtime with DirectML provider)
  • 20 Pre-trained YOLO11 models:
    • Detection: yolo11n.onnx, yolo11s.onnx, yolo11m.onnx, yolo11l.onnx, yolo11x.onnx
    • Pose: yolo11n-pose.onnx, yolo11s-pose.onnx, yolo11m-pose.onnx, yolo11l-pose.onnx, yolo11x-pose.onnx
    • Segmentation: yolo11n-seg.onnx, yolo11s-seg.onnx, yolo11m-seg.onnx, yolo11l-seg.onnx, yolo11x-seg.onnx
    • Classification: yolo11n-cls.onnx, yolo11s-cls.onnx, yolo11m-cls.onnx, yolo11l-cls.onnx, yolo11x-cls.onnx
  • Custom model support with automatic task detection

Technical Details
  • Build System: MSBuild uses intrinsic functions to detect Windows platform
  • DirectML Integration: ExecutionProvider appended with ORT_SEQUENTIAL mode
  • Optimization: GraphOptimizationLevel set to ORT_ENABLE_ALL
  • Fallback Support: Automatic fallback to CPU provider if DirectML initialization fails
  • Task Detection: Automatic task type detection from model filename suffixes:
    • -pose → Pose Estimation (17 keypoints)
    • -seg → Instance Segmentation (polygon contours)
    • -cls → Image Classification (224×224 input)
    • No suffix → Object Detection (80 COCO classes)
  • Thread Safety: Thread-safe inference with object pooling for optimal performance
  • Model Caching: Efficient multi-model caching for quick task switching
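The classification path described above (softmax over the flat logits, sort descending, keep top-K above a threshold) can be sketched in a few lines. Python is used here for illustration; the `top_k_classes` helper is hypothetical and the module itself implements this in C#:

```python
import numpy as np

def top_k_classes(logits: np.ndarray, k: int = 5, min_confidence: float = 0.2):
    """Softmax over a flat [1, 1000] logit vector, then return the
    top-k (class_index, probability) pairs above min_confidence."""
    z = logits.reshape(-1).astype(np.float64)
    z = z - z.max()                       # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    order = np.argsort(probs)[::-1][:k]   # indices of the k largest probs
    return [(int(i), float(probs[i])) for i in order if probs[i] >= min_confidence]
```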

Known Limitations
  • DirectML only available on Windows platforms
  • Requires DirectX 12 compatible hardware
  • ARM64 builds use CPU (DirectML not available on ARM64)

Installation
  1. Extract the release package to your CodeProject.AI modules directory
  2. Ensure DirectX 12 runtime is installed (Windows 10/11 includes this)
  3. Verify GPU supports DirectX 12: Run dxdiag and check Feature Levels
  4. Module will automatically use DirectML on compatible systems

Verification
To confirm DirectML is active, check module logs for:
Code:
ObjectDetection (.NET YOLO11) setting ExecutionProvider = "DirectML"
InferenceLibrary = "DirectML"
InferenceDevice = "GPU"

API Usage Examples

Object Detection
Code:
POST /vision/detection
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.4 (optional)

Pose Estimation
Code:
POST /vision/pose
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.4 (optional)

Instance Segmentation
Code:
POST /vision/segment
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.4 (optional)

Image Classification
Code:
POST /vision/classify
Content-Type: multipart/form-data

image: <file>
min_confidence: 0.2 (optional)
top: 5 (optional, default: 5)
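A response from the detection route above can be consumed like this. The JSON field names (`predictions`, `label`, `confidence`, `x_min`/`y_min`/`x_max`/`y_max`) follow CodeProject.AI's typical detection response shape, but treat them as an assumption here and check the module's actual output:

```python
import json

# Hypothetical /vision/detection response; field names are assumptions.
sample = json.loads("""
{
  "success": true,
  "predictions": [
    {"label": "person", "confidence": 0.91,
     "x_min": 12, "y_min": 40, "x_max": 210, "y_max": 380}
  ]
}
""")

def summarize(response: dict) -> list:
    """Turn a detection response into 'label (confidence)' strings."""
    return [f"{p['label']} ({p['confidence']:.2f})"
            for p in response.get("predictions", [])]
```

For example, `summarize(sample)` yields `["person (0.91)"]`.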


 
What OS are you running CP.AI on? If you are running CP.AI on Windows, it will use the DirectML GPU, which is faster than the CUDA GPU.
Windows 11. For some reason it won't choose DirectML with YOLOv5 6.2; I'm sure I'm doing something wrong. I've been running this way for quite a while.

Thanks Mike!
 
The Object Detection (YOLOv5 6.2) module does not work with DirectML; it only works with CUDA. If you want to use DirectML, you need to use the Object Detection (YOLOv5 .NET) module.