Aggregator
Cryptology Association Forced to Rerun Election After Losing Key
NDSS 2025 – Explanation As A Watermark
SESSION
Session 3D: AI Safety
-----------
Authors, Creators & Presenters: Shuo Shao (Zhejiang University), Yiming Li (Zhejiang University), Hongwei Yao (Zhejiang University), Yiling He (Zhejiang University), Zhan Qin (Zhejiang University), Kui Ren (Zhejiang University)
-----------
PAPER
Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution
Ownership verification is currently the most critical and widely adopted post-hoc method to safeguard model copyright. In general, model owners use it to determine whether a suspicious third-party model was stolen from them by examining whether it has particular properties 'inherited' from their released models. Currently, backdoor-based model watermarks are the primary and cutting-edge methods for implanting such properties in released models. However, backdoor-based methods have two fatal drawbacks: harmfulness and ambiguity. The former means that they introduce maliciously controllable misclassification behaviors (i.e., a backdoor) into the watermarked released models. The latter means that malicious users can easily pass verification by finding other misclassified samples, leading to ownership ambiguity.
In this paper, we argue that both limitations stem from the 'zero-bit' nature of existing watermarking schemes, which use only the status of predictions (misclassified or not) for verification. Motivated by this understanding, we design a new watermarking paradigm, "Explanation as a Watermark" (EaaW), that implants verification behaviors into the feature attribution explanation instead of the model predictions. Specifically, EaaW embeds a 'multi-bit' watermark into the feature attribution explanation of specific trigger samples without changing the original prediction. We correspondingly design watermark embedding and extraction algorithms inspired by explainable artificial intelligence. In particular, our approach can be used for different tasks (image classification and text generation). Extensive experiments verify the effectiveness and harmlessness of EaaW and its resistance to potential attacks.
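To make the idea of reading a multi-bit watermark out of a feature attribution more concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes a LIME-style extraction (random feature masking followed by a ridge regression whose coefficient signs are read as bits), and it skips the embedding step entirely by using a hypothetical linear `toy_model` whose weights already carry the watermark. All names (`extract_bits`, `bit_match_rate`, `toy_model`, `trigger`) are illustrative assumptions.

```python
import numpy as np

def extract_bits(model, trigger, n_masks=512, ridge=1e-3, seed=0):
    """Assumed LIME-style extraction: mask features of the trigger sample,
    query the model, fit a ridge regression, and read the sign of each
    attribution weight as one watermark bit."""
    rng = np.random.default_rng(seed)
    d = trigger.shape[0]
    # Each row is a random binary mask over the trigger's features.
    masks = rng.integers(0, 2, size=(n_masks, d)).astype(float)
    outputs = np.array([model(trigger * m) for m in masks])
    # Center inputs and outputs so no intercept term is needed.
    X = masks - masks.mean(axis=0)
    y = outputs - outputs.mean()
    # Closed-form ridge regression: w = (X^T X + lambda I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)
    return (w > 0).astype(int)

def bit_match_rate(a, b):
    """Fraction of positions where two bit strings agree."""
    return float(np.mean(a == b))

# Hypothetical 16-bit watermark and a toy "watermarked" model whose
# feature weights encode the bits by sign (embedding is not modeled).
watermark = np.random.default_rng(42).integers(0, 2, size=16)
coeffs = np.where(watermark == 1, 1.0, -1.0)

def toy_model(x):
    return float(x @ coeffs)

trigger = np.ones(16)
extracted = extract_bits(toy_model, trigger)
print("bit match rate:", bit_match_rate(extracted, watermark))
```

Because verification compares a whole bit string rather than a single "misclassified or not" outcome, an unrelated model is unlikely to match many bits by chance, which is the intuition behind the multi-bit claim in the abstract.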
-----------
ABOUT NDSS
The Network and Distributed System Security Symposium (NDSS) fosters information exchange among researchers and practitioners of network and distributed system security. The target audience includes those interested in practical aspects of network and distributed system security, with a focus on actual system design and implementation. A major goal is to encourage and enable the Internet community to apply, deploy, and advance the state of available security technologies.
Our thanks to the Network and Distributed System Security (NDSS) Symposium for publishing its creators', authors', and presenters' superb NDSS Symposium 2025 conference content on the organization's YouTube channel.
The post NDSS 2025 – Explanation As A Watermark appeared first on Security Boulevard.
CVE-2024-23690 | Netgear FVS336Gv3/FVS336Gv2 up to 4.3 Telnet backup_configuration os command injection (EUVD-2024-21151)
CVE-2024-13975 | Commvault up to 11.32.59/11.34.33/11.36.7 on Windows privileges management (EUVD-2024-54819)
CVE-2024-13976 | Commvault on Windows Installation uncontrolled search path (EUVD-2024-54818)
CVE-2024-12856 | Four-Faith F3x24/F3x36 2.0 apply.cgi os command injection (EUVD-2024-51157)
CVE-2024-23692 | Rejetto HTTP File Server up to 2.3m HTTP Request special elements used in a template engine (EUVD-2024-21153 / EDB-52102)
CVE-2024-0401 | Asus RT-AX3000 Custom OpenVPN Profile os command injection (EUVD-2024-16197)
CVE-2010-20120 | Maplesoft Maple up to 13 Maplet File code injection (EUVD-2010-5322)
CVE-2009-20005 | InterSystems Caché up to 2009.1 HTTP GET Request UtilConfigHome.csp stack-based overflow (EUVD-2009-5127 / EDB-16807)
CVE-2024-11680 | ProjectSend up to r1719 HTTP Request options.php improper authentication (EUVD-2024-34152 / Nessus ID 271956)
CVE-2012-10061 | Sockso Project Music Host Server up to 1.5 HTTP Interface /file/ path traversal (EUVD-2012-6607)
CVE-2014-125115 | Artica Pandora FMS up to 5.0 SP2 File Manager mobile/index.php loginhash_data sql injection (Exploit 35380 / EUVD-2014-9806)
SecWiki News 2025-11-22 Review
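Intelligent Provenance Analysis and Intrusion Detection: Insights, Challenges, and Outlook by ourren
Lessons from DARPA's Project Funding Layout in Artificial Intelligence by ourren
A Jailbreak Attack Method for Large Language Models (ForgeDAN) by ourren
For more of the latest articles, visit SecWiki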