> Any plan to release an object detection YOLO11 module? Anyway, I appreciate your hard work on license plate recognition (YOLOv11); it works great on my cameras.

I started working on the YOLO11 .NET module. I have been using GitHub Copilot to do most of the coding; below is the plan that GitHub Copilot came up with.
Plan: Create ObjectDetectionYOLO11Net Multi-Task Vision Module
Build a new YOLO11 module supporting Detection, Pose, Segment, and Classify tasks by adapting the YOLOv5 module structure. The module auto-detects task type from model filename suffixes (-pose, -seg, -cls), implements task-specific output parsers, and exposes separate API routes while maintaining backward compatibility with /vision/detection.
Steps:
1. Create project foundation: Copy ObjectDetectionYOLOv5Net.csproj to ObjectDetectionYOLO11Net/ObjectDetectionYOLO11Net.csproj, update <RootNamespace> to CodeProject.AI.Modules.ObjectDetection.YOLO11, <AssemblyName> to ObjectDetectionYOLO11Net, <Name> to Object Detection (YOLO11 .NET), then copy Program.cs, updating the namespace to CodeProject.AI.Modules.ObjectDetection.YOLO11 and the service to ObjectDetectionYOLO11ModuleRunner.
2. Implement YOLO11 model hierarchy in YOLOv11/Models/: Create abstract Yolo11Model.cs extending YoloModel, adding a TaskType enum property (Detect/Pose/Segment/Classify), then concrete models inheriting from Yolo11Model: Yolo11DetectionModel.cs (640×640, 84 dims, 80 COCO labels from yolo11*.json), Yolo11PoseModel.cs (640×640, 56 dims, 1 "person" class, 17 keypoint names), Yolo11SegmentationModel.cs (640×640, 116 dims, 80 COCO labels, 32 mask coefficients), Yolo11ClassificationModel.cs (224×224, 1000 dims, ImageNet labels from yolo11*-cls.json); a minimal hierarchy sketch follows these steps.
3. Build unified scorer YOLOv11/Yolo11Scorer.cs: Adapt YoloScorer.cs with a constructor detecting the task from the filename (Contains("-pose") → Pose, Contains("-seg") → Segment, Contains("-cls") → Classify, else → Detect; see the sketch after these steps), reuse ResizeImage() and ExtractPixels() unchanged, implement Predict() routing to task-specific parsers: ParseDetectionOutput() (handle [1,8400,84] with no objectness: find max at indices 4-83), ParsePoseOutput() (extract bbox at indices 0-3, confidence at 4, keypoints at 5-55 as 17×3 floats, apply coordinate transform (kp-pad)/gain), ParseSegmentationOutput() (extract bbox at 0-3, classes at 4-83, mask coeffs at 84-115, generate masks from the prototype tensor, extract polygon contours), ParseClassificationOutput() (softmax top-5 from the flat [1,1000] output with a confidence threshold).
4. Implement prediction classes in YOLOv11/: Create base Yolo11Prediction.cs with Label/Rectangle/Score matching YoloPrediction, extend with Yolo11PosePrediction adding Keypoint[] Keypoints (struct: float X, Y, Confidence) and static readonly int[][] Skeleton (19 COCO connections: nose-eyes, eyes-ears, shoulders, arms, torso, hips, legs), Yolo11SegmentationPrediction adding List<SKPoint> Contour (polygon from mask threshold), Yolo11ClassificationPrediction (no Rectangle, only List<ClassScore> with Label/Confidence); the keypoint struct and skeleton table are sketched after these steps.
5. Create module runner ObjectDetectionYOLO11ModuleRunner.cs: Adapt ObjectDetectionModuleRunner.cs updating the namespace, add Process() handlers for the commands detect (backward-compatible /vision/detection route), detect-pose (/vision/pose), detect-segment (/vision/segment), classify (/vision/classify), implement GetStandardModelPath() mapping size→filename (tiny→yolo11n, small→yolo11s, medium→yolo11m, large→yolo11l, xlarge→yolo11x) with task suffix detection (-pose, -seg, -cls; see the size-mapping sketch after these steps), create response builders BuildDetectionResponse() (existing DetectedObject array), BuildPoseResponse() (extends DetectedObject with a keypoints field as float[]), BuildSegmentationResponse() (adds a contour field as int[]), BuildClassificationResponse() (new response with a classifications array), and keep the existing list-custom and custom model routing with task detection from the filename.
6. Configure settings and build system: Copy modulesettings.json updating Name to Object Detection (YOLO11 .NET), ModuleId to ObjectDetectionYOLO11Net, MODELS_DIR/CUSTOM_MODELS_DIR paths unchanged, add route maps for /vision/pose, /vision/segment, /vision/classify with appropriate input/output descriptions (pose outputs include keypoints and skeleton, segment includes contour, classify includes classifications with top parameter default 5), copy appsettings.json/Properties/launchSettings.json updating module name, adapt build scripts (Build.bat, install.bat, package.bat) replacing yolov5 with yolo11, copy test/home-office.jpg for self-test, implement SelfTest() running detection on test image.
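A minimal sketch of step 2's hierarchy, including the TaskType enum it introduces. In the real module these classes would extend the existing YoloModel base from the YOLOv5 module; the property shapes here are my approximation:

```csharp
using System;

public enum TaskType { Detect, Pose, Segment, Classify }

// The real classes would extend the existing YoloModel base class
public abstract class Yolo11Model
{
    public abstract TaskType Task       { get; }
    public abstract int      Width      { get; }
    public abstract int      Height     { get; }
    public abstract int      Dimensions { get; }   // values per candidate box
    public string[] Labels { get; set; } = Array.Empty<string>();
}

public class Yolo11DetectionModel : Yolo11Model
{
    public override TaskType Task       => TaskType.Detect;
    public override int      Width      => 640;
    public override int      Height     => 640;
    public override int      Dimensions => 84;     // 4 bbox + 80 COCO classes
}

public class Yolo11PoseModel : Yolo11Model
{
    public override TaskType Task       => TaskType.Pose;
    public override int      Width      => 640;
    public override int      Height     => 640;
    public override int      Dimensions => 56;     // 4 bbox + 1 confidence + 17×3 keypoints
}
```

The segmentation (116 dims) and classification (224×224, 1000 dims) models follow the same pattern.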
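Step 3's filename detection, reusing TaskType from the previous sketch (the Yolo11TaskDetector class name is illustrative, not the module's):

```csharp
using System.IO;

public static class Yolo11TaskDetector
{
    public static TaskType FromModelPath(string modelPath)
    {
        // Match against the filename only, so directory names can't trigger a suffix
        string name = Path.GetFileNameWithoutExtension(modelPath).ToLowerInvariant();

        if (name.Contains("-pose")) return TaskType.Pose;
        if (name.Contains("-seg"))  return TaskType.Segment;
        if (name.Contains("-cls"))  return TaskType.Classify;
        return TaskType.Detect;     // no suffix: plain detection
    }
}
```

So custom-models/my-model-pose.onnx maps to TaskType.Pose, and yolo11m.onnx falls through to TaskType.Detect.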
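Step 4's pose additions are mostly data. A sketch of the keypoint struct plus one common 0-indexed rendering of the 19 COCO skeleton connections; verify the exact pairs against the module's keypoint order before shipping:

```csharp
public readonly record struct Keypoint(float X, float Y, float Confidence);

public static class CocoPose
{
    // Keypoint order assumed: 0 nose, 1-2 eyes, 3-4 ears, 5-6 shoulders, 7-8 elbows,
    // 9-10 wrists, 11-12 hips, 13-14 knees, 15-16 ankles
    public static readonly int[][] Skeleton =
    {
        new[] { 0, 1 },  new[] { 0, 2 },  new[] { 1, 2 },  new[] { 1, 3 },      // head
        new[] { 2, 4 },  new[] { 3, 5 },  new[] { 4, 6 },                       // ears to shoulders
        new[] { 5, 6 },  new[] { 5, 7 },  new[] { 7, 9 },  new[] { 6, 8 },      // shoulders and arms
        new[] { 8, 10 },                                                        // right forearm
        new[] { 5, 11 }, new[] { 6, 12 }, new[] { 11, 12 },                     // torso and hips
        new[] { 11, 13 }, new[] { 13, 15 }, new[] { 12, 14 }, new[] { 14, 16 }  // legs
    };
}
```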
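Step 5's size→filename mapping is mechanical enough to sketch as well; the method name and parameters here are assumptions:

```csharp
public static string GetStandardModelFileName(string size, TaskType task)
{
    string baseName = size switch
    {
        "tiny"   => "yolo11n",
        "small"  => "yolo11s",
        "medium" => "yolo11m",
        "large"  => "yolo11l",
        "xlarge" => "yolo11x",
        _        => "yolo11m"      // assumed default; the module may reject unknown sizes instead
    };

    string suffix = task switch
    {
        TaskType.Pose     => "-pose",
        TaskType.Segment  => "-seg",
        TaskType.Classify => "-cls",
        _                 => string.Empty
    };

    return baseName + suffix + ".onnx";
}
```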
Critical Implementation Details:
- Tensor dimension handling: Check the output tensor's dimension order before parsing; YOLO11 may emit [1,84,8400] or [1,8400,84]
- YOLO11 confidence parsing: There is no objectness score (buffer[4] is the first class score), so iterate buffer[4] through buffer[83] to find the max class confidence and compare it directly against minConfidence; both of these bullets are covered by the parsing sketch below this list
- Pose keypoints: After NMS on bounding boxes, extract keypoints at indices 5-55 (17 keypoints × 3), apply the same letterbox transform as the bbox coordinates, filter by keypoint confidence >0.5 (keypoint sketch below)
- Segmentation masks: Multiply the 32 coefficients (buffer[84-115]) with the prototype tensor [1,32,160,160] using matrix math, apply sigmoid, threshold at 0.5, use SkiaSharp path tracing to extract the contour polygon, resize/crop to the bbox region (mask math sketched below)
- Classification: No bbox/NMS needed; apply softmax to the logits, sort descending, take the top-K (default 5), filter by minConfidence (sketched below)
- Custom models: Detect task from filename pattern in custom-models/ (e.g., my-model-pose.onnx → Pose task)
- Thread safety: Maintain lock on InferenceSession.Run(), use ConcurrentBag in parallel loops
- GPU fallback: Wrap scorer initialization in try-catch with a CPU fallback, per ObjectDetector.cs lines 68-77 (pattern sketched below)
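The first two bullets combine into a single parsing concern. Here is a sketch that handles either dimension order and finds the best class score with no objectness term; the flat float[] layout (channel-major for [1,84,8400], box-major for [1,8400,84]) is my assumption about how the tensor was copied out:

```csharp
// Returns the best class id and score for one candidate box.
// dims is the ONNX output shape, e.g. {1, 84, 8400} or {1, 8400, 84}.
static (int ClassId, float Score) BestClass(float[] data, int[] dims, int box)
{
    // Heuristic: there are always far more boxes (8400) than values per box (84)
    bool channelMajor = dims[1] < dims[2];          // true for [1,84,8400]
    int  numBoxes     = channelMajor ? dims[2] : dims[1];
    int  numValues    = channelMajor ? dims[1] : dims[2];

    int   bestId    = -1;
    float bestScore = 0f;
    for (int c = 4; c < numValues; c++)             // index 4 is the first class score
    {
        float s = channelMajor ? data[c * numBoxes + box]
                               : data[box * numValues + c];
        if (s > bestScore) { bestScore = s; bestId = c - 4; }
    }
    return (bestId, bestScore);                     // compare Score against minConfidence
}
```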
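For the pose bullet, a keypoint-extraction sketch reusing the Keypoint struct from the step 4 sketch, assuming the 5-55 layout and the same gain/pad values the letterbox resize produced:

```csharp
// Keypoint is the record struct from the step 4 sketch above
static Keypoint[] ExtractKeypoints(float[] candidate, float gain, float padX, float padY)
{
    // candidate holds one box: indices 0-3 bbox, 4 confidence, 5-55 keypoints (17 × 3)
    var keypoints = new Keypoint[17];
    for (int k = 0; k < 17; k++)
    {
        float x    = candidate[5 + k * 3];
        float y    = candidate[5 + k * 3 + 1];
        float conf = candidate[5 + k * 3 + 2];

        // Undo the letterbox: subtract padding, divide by the scale gain
        keypoints[k] = new Keypoint((x - padX) / gain, (y - padY) / gain, conf);
    }
    return keypoints;               // callers filter on Confidence > 0.5 before drawing
}
```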
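The segmentation mask math, written out naively over pre-flattened tensors; a real implementation would use matrix ops and crop to the bbox region before tracing contours:

```csharp
using System;

// coeffs: the 32 mask coefficients for one detection (buffer[84..115])
// protos: the prototype tensor [1,32,160,160] flattened to 32*160*160 floats
static bool[] BuildMask(float[] coeffs, float[] protos, int maskW = 160, int maskH = 160)
{
    var mask = new bool[maskW * maskH];
    for (int p = 0; p < maskW * maskH; p++)
    {
        // Dot product of the coefficients with the prototype stack at this pixel
        float sum = 0f;
        for (int c = 0; c < 32; c++)
            sum += coeffs[c] * protos[c * maskW * maskH + p];

        // Sigmoid, then threshold at 0.5 to binarize
        mask[p] = 1f / (1f + MathF.Exp(-sum)) > 0.5f;
    }
    return mask;    // hand off to SkiaSharp path tracing for the contour polygon
}
```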
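The classification bullet as a sketch, with a numerically stable softmax (subtracting the max logit before exponentiating); the 0.4 default for minConfidence is a placeholder:

```csharp
using System;
using System.Linq;

static (string Label, float Confidence)[] TopK(float[] logits, string[] labels,
                                               int top = 5, float minConfidence = 0.4f)
{
    float max   = logits.Max();
    var   exps  = logits.Select(l => MathF.Exp(l - max)).ToArray();
    float total = exps.Sum();

    return exps.Select((e, i) => (Label: labels[i], Confidence: e / total))
               .OrderByDescending(p => p.Confidence)
               .Take(top)                                  // top-K first...
               .Where(p => p.Confidence >= minConfidence)  // ...then the confidence filter
               .ToArray();
}
```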
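And the GPU fallback bullet, using the ONNX Runtime API. SessionOptions.MakeSessionOptionWithCudaProvider is the real call; the wrapper method is illustrative:

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

static InferenceSession CreateSession(string modelPath, bool tryGpu)
{
    if (tryGpu)
    {
        try
        {
            // Requires the Microsoft.ML.OnnxRuntime.Gpu package and a matching CUDA runtime
            SessionOptions options = SessionOptions.MakeSessionOptionWithCudaProvider(deviceId: 0);
            return new InferenceSession(modelPath, options);
        }
        catch (Exception)
        {
            // CUDA unavailable or incompatible: fall back to CPU below
        }
    }
    return new InferenceSession(modelPath);
}
```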
Validation:
- Self-test uses test/home-office.jpg for detection (person, laptop, keyboard expected)
- Pose test expects person detections with 17 keypoints
- Segment test expects object masks with polygon contours
- Existing /vision/detection route continues working with YOLO11 detection models


