Mirror of https://github.com/saymrwulf/onnxruntime.git, synced 2026-05-19 21:32:23 +00:00
Update quantization_defs.cc (#14380)
parent 2d8ee5251c
commit de7a868d5f
2 changed files with 6 additions and 0 deletions
@@ -2408,6 +2408,8 @@ This version of the operator has been available since version 1 of the 'com.micr
 #### Attributes

 <dl>
+<dt><tt>mask_filter_value</tt> : float</dt>
+<dd>The value to be filled in the attention mask. Default value is -10000.0f</dd>
 <dt><tt>num_heads</tt> : int (required)</dt>
 <dd>Number of attention heads</dd>
 <dt><tt>past_present_share_buffer</tt> : int</dt>
@@ -952,6 +952,10 @@ ONNX_MS_OPERATOR_SET_SCHEMA(
 .Attr("past_present_share_buffer", "Corresponding past and present are same tensor, its shape is "
       "(2, batch_size, num_heads, max_sequence_length, head_size)",
       AttributeProto::INT, OPTIONAL_VALUE)
+.Attr("mask_filter_value",
+      "The value to be filled in the attention mask. Default value is -10000.0f",
+      AttributeProto::FLOAT,
+      OPTIONAL_VALUE)
 .Attr("scale",
       "Custom scale will be used if specified. Default value is 1/sqrt(head_size)",
       AttributeProto::FLOAT,
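For readers unfamiliar with the two optional attributes in this hunk, the following is a minimal sketch of how a `mask_filter_value` and a `scale` are commonly applied to one row of attention scores. It assumes the usual additive-mask convention (masked positions receive the filter value before softmax) and is not taken from the ONNX Runtime kernels. The function `MaskedSoftmaxRow`, its parameters, and the use of `scale == 0` to mean "attribute not specified" are all hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT part of ONNX Runtime.
// scores: raw q.k dot products for one query row.
// keep:   1 = position may be attended to, 0 = position is masked out.
std::vector<float> MaskedSoftmaxRow(const std::vector<float>& scores,
                                    const std::vector<int>& keep,
                                    std::size_t head_size,
                                    float mask_filter_value = -10000.0f,
                                    float scale = 0.0f) {
  assert(scores.size() == keep.size());
  if (scores.empty()) return {};

  // Assumption of this sketch: scale == 0 means "not specified",
  // so fall back to the documented default 1/sqrt(head_size).
  const float s = (scale != 0.0f)
                      ? scale
                      : 1.0f / std::sqrt(static_cast<float>(head_size));

  // Scale the scores and add the filter value at masked positions.
  std::vector<float> logits(scores.size());
  for (std::size_t i = 0; i < scores.size(); ++i) {
    logits[i] = scores[i] * s + (keep[i] ? 0.0f : mask_filter_value);
  }

  // Numerically stable softmax: subtract the row maximum before exp.
  const float max_logit = *std::max_element(logits.begin(), logits.end());
  float sum = 0.0f;
  std::vector<float> probs(logits.size());
  for (std::size_t i = 0; i < logits.size(); ++i) {
    probs[i] = std::exp(logits[i] - max_logit);
    sum += probs[i];
  }
  for (float& p : probs) p /= sum;
  return probs;
}
```

Calling `MaskedSoftmaxRow({1.0f, 2.0f, 3.0f}, {1, 1, 0}, 4)` in this sketch leaves essentially zero probability on the third position, because the large negative filter value dominates its logit. A less extreme `mask_filter_value` would let some probability leak through to masked positions.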