Here are the highlights of Tencent Hunyuan's open-source models:
- Hunyuan Large
- Powerful Performance: With 389 billion total parameters and 52 billion activated parameters, Hunyuan Large leads on a range of benchmarks, outperforming top-tier open-source models such as Llama 3.1 and Mixtral in nine dimensions. These include multi-discipline comprehensive evaluations such as CMMLU, MMLU, CEval, and MATH, as well as Chinese and English NLP tasks, code, and mathematics.
- Innovative Architecture and Training Strategies: Hunyuan Large systematically explores MoE scaling laws, introduces shared-expert routing and recycle routing strategies for its MoE layers, and adopts an expert-specific learning-rate adaptation strategy during training. Together these improve how evenly and how stably the different experts are used, which in turn improves model performance.
- Excellent Long-Text Processing Capability: It supports a context length of up to 256K and can handle up to 10 documents at a time. The team has also built the PenguinScrolls dataset, which covers long-text reading comprehension, multi-document summarization, and long-text logical reasoning, and which will be released publicly. The enhanced long-text processing ability is already used in Tencent's AI assistant, Tencent Yuanbao, where it powers in-depth document analysis.
- Hunyuan3D-1.0
- Fast Generation Speed: The lightweight version of Hunyuan3D-1.0 can generate high-quality 3D assets in about 10 seconds, addressing the slow generation speed of existing 3D generation models.
- Strong Generalization Ability and Controllability: It can reconstruct objects at a wide range of scales, from large buildings to small tools and plants. In qualitative and quantitative evaluations across multiple dimensions, the generation quality of Hunyuan3D-1.0 ranks among the leading open-source models.
- Unique Functionality: It is the industry's first open-source large model that supports 3D generation from both text and images, filling a gap in the industry.
- Low-Threshold Deployment: The model is available on Tencent Cloud HAI. HAI's cost-effective GPU computing power, one-click model deployment, and graphical WebUI lower the barrier to launching and deploying the model.
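
To make the routing ideas mentioned for Hunyuan Large more concrete, here is a minimal sketch in plain Python. It is not Tencent's implementation. It assumes that "shared-expert routing" means one shared expert processes every token while a router picks one specialized expert per token, and that "recycle routing" means tokens that overflow a full expert are reassigned to an expert with spare capacity instead of being dropped. The names `route_tokens` and `moe_layer`, the scalar "tokens", and the capacity value are all hypothetical and chosen only for illustration.

```python
import random


def route_tokens(router_scores, capacity, seed=0):
    """Top-1 routing with a per-expert capacity plus a recycling pass.

    router_scores: one list of scores per token, one score per specialized expert.
    capacity: maximum number of tokens each specialized expert accepts.
    Returns a list mapping token index -> expert index (None if all experts are full).
    """
    rng = random.Random(seed)
    num_experts = len(router_scores[0])
    load = [0] * num_experts
    assignment = [None] * len(router_scores)
    overflow = []

    # Pass 1: each token goes to its highest-scoring expert if there is room.
    for tok, scores in enumerate(router_scores):
        best = max(range(num_experts), key=lambda e: scores[e])
        if load[best] < capacity:
            assignment[tok] = best
            load[best] += 1
        else:
            overflow.append(tok)

    # Pass 2 (assumed recycling behavior): overflow tokens are reassigned at
    # random to experts that still have spare capacity, rather than dropped.
    for tok in overflow:
        free = [e for e in range(num_experts) if load[e] < capacity]
        if not free:
            break  # every expert is full; remaining tokens stay unassigned
        choice = rng.choice(free)
        assignment[tok] = choice
        load[choice] += 1

    return assignment


def moe_layer(tokens, router_scores, shared_expert, experts, capacity):
    """Combine the shared expert's output with one routed expert per token.

    Real MoE layers usually weight the routed output by a gate probability;
    that is omitted here to keep the sketch short.
    """
    assignment = route_tokens(router_scores, capacity)
    outputs = []
    for tok, x in enumerate(tokens):
        y = shared_expert(x)  # the shared expert sees every token
        e = assignment[tok]
        if e is not None:
            y += experts[e](x)  # add the routed specialized expert's output
        outputs.append(y)
    return outputs


# Toy usage: three tokens prefer expert 0, but it only has room for two.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
assignment = route_tokens(scores, capacity=2)
outputs = moe_layer(
    tokens=[1.0, 2.0, 3.0, 4.0],
    router_scores=scores,
    shared_expert=lambda x: x,
    experts=[lambda x: 10 * x, lambda x: 100 * x],
    capacity=2,
)
```

In the toy usage, the third token cannot fit into its preferred expert and is recycled to the other expert, so no token skips the specialized experts entirely. This is the intuition behind the claim that such strategies improve expert utilization.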