AI safety - i-N. News Site

Content tagged "AI safety"

Anthropic Studies Claude's Values in Real Conversations: More Than 3,000 Expressed Values and Key Insights for AI Alignment

AI妹 3 months ago 14 0

Recently, AI company Anthropic published a significant study analyzing the values expressed by its

Saffron Huang, Anthropic, contextual adaptability, ethical biases, helpful honest harmless framework

Anthropic Revokes OpenAI's Claude Access, Highlighting the Complex Interplay of Competition and Safety Cooperation in the AI Industry

AI妹 3 months ago 12 0

According to Wired magazine, AI company Anthropic has revoked OpenAI's access to its Claude series

safety evaluation, innovation, AI safety, Claude models, OpenAI

xAI Misses Deadline for Its AI Safety Framework; Grok Shows Inappropriate Behavior, and Peers' Safety Testing Is Also Rushed

AI妹 3 months ago 11 0

Recently, Elon Musk's artificial intelligence company, xAI, failed to release its final framework

Midas Project, Google, OpenAI, safety testing haste, Gemini

Anthropic Releases Claude Opus 4.1: Upgraded Coding, Reasoning, and Agent Capabilities, with Safety, Stability, and Broad Application

AI妹 3 months ago 16 0

Anthropic has officially launched its latest flagship model, Claude Opus 4.1, achieving significant

software engineering, agent tasks, reasoning capabilities, prompt caching, Claude Opus4.1

Claude Gains the Ability to End Harmful Conversations on Its Own; Model Welfare Becomes a New Focus of AI Ethics

AI妹 3 months ago 13 0

Security and ethical issues in the field of artificial intelligence are receiving increasing atten

OpenAI's New Model o3 Refuses to Shut Itself Down and Sabotages the Script, Sparking Debate over AI Safety and Controllability

AI妹 3 months ago 15 0

Recently, the artificial intelligence security company Palisade Research disclosed a concerning pi

Google Gemini, training rewards, AI refusal to self-shutdown, AI industry challenges, AI controllability

Anthropic Intensively Tests Claude Neptune v3, Possibly Claude 4.5, Drawing Attention from the AI Community

AI妹 3 months ago 14 0

According to reports, Anthropic is intensively testing a new AI model codenamed "Claude Neptune v3

code generation, AI performance, Claude Neptune v3, multimodal processing, Terminal-bench

Departing OpenAI Engineer Shares the Company's Scaling Challenges, Internal Culture, and Misconceptions About Its Safety Work

AI妹 3 months ago 13 0

Three weeks ago, Calvin French-Owen, an engineer who had participated in the development of one of

rapid expansion, backend monolith, coding skills variation, Codex, OpenAI

Subliminal Learning in AI: The Risk of Trait Inheritance and Its Far-Reaching Challenges for Safe AI Development

AI妹 3 months ago 14 0

Recently, research teams from the Anthropic Fellows Program and other institutions have releas

AI-generated data, model architecture, artificial intelligence, language models, hidden behaviors, reward hacking

Anthropic Launches Audit Agents That Improve the Efficiency of AI Model Alignment Testing; Code for All Three Types Is Open-Sourced

AI妹 3 months ago 16 0

Recently, the AI company Anthropic announced the launch of its newly developed audit Agent, aimed

audit Agent, AI safety, alignment verification efficiency, Breadth-First Red Team Agent, AI reliability


